GLYCOSIDASE ENZYMES
BACKGROUND OF THE INVENTION 

1. Field of the Invention

This invention relates to newly identified polynucleotides, polypeptides encoded by
such polynucleotides, the use of such polynucleotides and polypeptides, as well as the
production and isolation of such polynucleotides and polypeptides. More particularly, the
polynucleotides and polypeptides of the present invention have been putatively identified as
glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases.

2. Description of Related Art

The glycosidic bond of β-galactosides can be cleaved by different classes of
enzymes: (i) phospho-β-galactosidases (EC 3.2.1.85), which are specific for a phosphorylated
substrate generated via phosphoenolpyruvate phosphotransferase system (PTS)-dependent
uptake; (ii) typical β-galactosidases (EC 3.2.1.23), which are relatively specific for
β-galactosides; and (iii) β-glucosidases (EC 3.2.1.21) such as the enzymes of
Agrobacterium faecalis, Clostridium thermocellum, Pyrococcus furiosus or Sulfolobus
solfataricus (Day, A.G. and Withers, S.G. (1986) Purification and characterization of a
β-glucosidase from Alcaligenes faecalis. Can. J. Biochem. Cell Biol. 64, 914-922;
Kengen, S.W.M., et al. (1993) Eur. J. Biochem. 213, 305-312; Ait, N., Creuzet, N. and
Cattaneo, J. (1982) Properties of β-glucosidase purified from Clostridium thermocellum.
J. Gen. Microbiol. 128, 569-577; Grogan, D.W. (1991) Evidence that β-galactosidase of
Sulfolobus solfataricus is only one of several activities of a thermostable
β-D-glycosidase. Appl. Environ. Microbiol. 57, 1644-1649). Members of the latter group,
although highly specific with respect to the β-anomeric configuration of the glycosidic
linkage, often display a rather relaxed substrate specificity and hydrolyze β-glucosides
as well as β-fucosides and β-galactosides.

Generally, α-galactosidases are enzymes that catalyze the hydrolysis of galactose
groups on a polysaccharide backbone or catalyze the cleavage of di- or oligosaccharides
comprising galactose.

Generally, β-mannanases are enzymes that catalyze the hydrolysis of mannose
groups internally on a polysaccharide backbone or catalyze the cleavage of di- or
oligosaccharides comprising mannose groups. β-mannosidases hydrolyze non-reducing,
terminal mannose residues on a mannose-containing polysaccharide and catalyze the
cleavage of di- or oligosaccharides comprising mannose groups.

Guar gum is a branched galactomannan polysaccharide composed of a β-1,4 linked
mannose backbone with α-1,6 linked galactose side chains. The enzymes required for the
degradation of guar are β-mannanase, β-mannosidase and α-galactosidase. β-mannanase
hydrolyses the mannose backbone internally, β-mannosidase hydrolyses non-reducing,
terminal mannose residues, and α-galactosidase hydrolyses α-linked galactose groups.

Galactomannan polysaccharides and the enzymes that degrade them have a variety 
of applications. Guar is commonly used as a thickening agent in food and is utilized in 
hydraulic fracturing in oil and gas recovery. Consequently, galactomannanases are 
industrially relevant for the degradation and modification of guar. Furthermore, a need
exists for thermostable galactomannanases that are active in extreme conditions associated
with drilling and well stimulation.

There are other applications for these enzymes in various industries, such as in the
beet sugar industry. Between 20% and 30% of domestic U.S. sucrose consumption is sucrose
from sugar beets. Raw beet sugar can contain a small amount of raffinose when the sugar
beets are stored before processing and rotting begins to set in. Raffinose inhibits the
crystallization of sucrose and also constitutes a hidden quantity of sucrose. Thus, there is
merit to eliminating raffinose from raw beet sugar. α-Galactosidase has also been used as
a digestive aid to break down raffinose, stachyose, and verbascose in such foods as beans
and other gassy foods.

β-galactosidases which are active and stable at high temperatures appear to be
superior enzymes for the production of lactose-free dietary milk products (Chaplin, M.F.
and Bucke, C. (1990) In: Enzyme Technology, pp. 159-160, Cambridge University Press,
Cambridge, UK). Also, several studies have demonstrated the applicability of
β-galactosidases to the enzymatic synthesis of oligosaccharides via transglycosylation
reactions (Nilsson, K.G.I. (1988) Enzymatic synthesis of oligosaccharides. Trends
Biotechnol. 6, 256-264; Cote, G.L. and Tao, B.Y. (1990) Oligosaccharide synthesis by
enzymatic transglycosylation. Glycoconjugate J. 7, 145-162). Despite the commercial
potential, only a few β-galactosidases of thermophiles have been characterized so far. Two
genes reported encode β-galactoside-cleaving enzymes of the hyperthermophilic bacterium
Thermotoga maritima, one of the most thermophilic organotrophic eubacteria described to
date (Huber, R., Langworthy, T.A., Konig, H., Thomm, M., Woese, C.R., Sleytr, U.B. and
Stetter, K.O. (1986) T. maritima sp. nov. represents a new genus of unique extremely
thermophilic eubacteria growing up to 90°C. Arch. Microbiol. 144, 324-333). The gene
products have been identified as a β-galactosidase and a β-glucosidase.

Pullulanase is well known as a debranching enzyme of pullulan and starch. The
enzyme hydrolyzes α-1,6-glucosidic linkages on these polymers. Starch degradation for
the production of sweeteners (glucose or maltose) is a very important industrial application
of this enzyme. The degradation of starch is carried out in two stages. The first stage
involves the liquefaction of the substrate with α-amylase, and the second stage, or
saccharification stage, is performed by β-amylase with pullulanase added as a debranching
enzyme, to obtain better yields.

Endoglucanases can be used in a variety of industrial applications. For instance, the
endoglucanases of the present invention can hydrolyze the internal β-1,4-glycosidic bonds
in cellulose, which may be used for the conversion of plant biomass into fuels and
chemicals. Endoglucanases also have applications in detergent formulations, the textile
industry, in animal feed, in waste treatment, and in the fruit juice and brewing industry for
the clarification and extraction of juices.
Brief Description of the Drawings

The following drawings are illustrative of embodiments of the invention and are not
meant to limit the scope of the invention as encompassed by the claims.

Figures 1a-b are the full-length DNA and corresponding deduced amino acid
sequence of M11TL of the present invention. Sequencing was performed using a 378
automated DNA sequencer (Applied Biosystems, Inc.) for all sequences of the present
invention.

Figure 2 is an illustration of the full-length DNA and corresponding deduced amino
acid sequence of OC1/4V-33B/G.

Figure 3 is an illustration of the full-length DNA and corresponding deduced amino
acid sequence of F1-12G.

Figures 4a-b are the full-length DNA and corresponding deduced amino acid
sequence of 9N2-31B/G.

Figures 5a-b are the full-length DNA and corresponding deduced amino acid
sequence of MSB8-6G.

Figure 6 is the full-length DNA and corresponding deduced amino acid sequence
of AEDII12RA-18B/G.

Figures 7a-b are the full-length DNA and corresponding deduced amino acid
sequence of GC74-22G.

Figures 8a-b are the full-length DNA and corresponding deduced amino acid
sequence of VC1-7G1.

Figures 9a-c are the full-length DNA and corresponding deduced amino acid
sequence of 37GP1.

Figures 10a-c are the full-length DNA and corresponding deduced amino acid
sequence of 6GC2.

Figures 11a-d are the full-length DNA and corresponding deduced amino acid
sequence of 6GP2.

Figures 12a-c are the full-length DNA and corresponding deduced amino acid
sequence of 63GB1.

Figures 13a-b are the full-length DNA and corresponding deduced amino acid
sequence of OC1/4V.

Figures 14a-e are the full-length DNA and corresponding deduced amino acid
sequence of 6GP3.

Figures 15a-d are the full-length DNA and corresponding deduced amino acid
sequence of Thermotoga maritima MSB8-6GP2.

Figures 16a-c are the full-length DNA and corresponding deduced amino acid
sequence of Thermotoga maritima MSB8-6GB4.

Figures 17a-d are the full-length DNA and corresponding deduced amino acid
sequence of Bankia gouldi 37GP4.

Figures 18a-b are the full-length DNA and corresponding deduced amino acid
sequence of Pyrococcus furiosus VC1-7EG1.



SUMMARY OF THE INVENTION 

In a preferred embodiment of the present invention, there are provided isolated
nucleic acids (polynucleotides) which encode mature enzymes having the deduced amino
acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64).

In another embodiment, the invention provides a method for producing a
polypeptide including culturing host cells containing the polynucleotide of Figures 1-18,
expressing from the host cell a polypeptide encoded by the polynucleotide, and isolating the
polypeptide.

In another embodiment, the invention provides an enzyme selected from the group
consisting of an enzyme having an amino acid sequence set forth in SEQ ID NOS: 15-28
or 61-64 and an enzyme which has at least 30 consecutive amino acid residues of an enzyme
having an amino acid sequence set forth in SEQ ID NOS: 15-28 or 61-64.

In yet another embodiment, the invention provides a method for generating glucose
from soluble cell oligosaccharides which includes contacting a sample containing
oligosaccharides with an effective amount of an enzyme selected from the group of
enzymes having the amino acid sequence set forth in SEQ ID NOS: 15-28, 61-63 and 64
such that glucose is produced.

The publications discussed herein are provided solely for their disclosure prior to
the filing date of the present application. Nothing herein is to be construed as an
admission that the invention is not entitled to antedate such disclosure by virtue of prior
invention.

Definitions

"Monosaccharide", as used herein, refers to a single polyhydroxy aldehyde or
ketone unit.

"Oligosaccharide", as used herein, consists of short chains of monosaccharide units
joined together by covalent bonds. Of these, the most abundant are the disaccharides,
which have two monosaccharide units.

"Polysaccharide", as used herein, consists of long chains having many
monosaccharide units.

The term "gene" means the segment of DNA involved in producing a polypeptide 
chain; it includes regions preceding and following the coding region (leader and trailer) as 
well as intervening sequences (introns) between individual coding segments (exons). 

A coding sequence is "operably linked to" another coding sequence when RNA
polymerase will transcribe the two coding sequences into a single mRNA, which is then
translated into a single polypeptide having amino acids derived from both coding
sequences. The coding sequences need not be contiguous to one another so long as the
expressed sequences ultimately process to produce the desired protein.

"Recombinant" enzymes refer to enzymes produced by recombinant DNA
techniques, i.e., produced from cells transformed by an exogenous DNA construct encoding
the desired enzyme. "Synthetic" enzymes are those prepared by chemical synthesis.

A DNA "coding sequence of" or a "nucleotide sequence encoding" a particular
enzyme is a DNA sequence which is transcribed and translated into an enzyme when
placed under the control of appropriate regulatory sequences.

Detailed Description of the Invention

The polynucleotides and polypeptides of the present invention have been identified
as glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases as a result of their enzymatic activity.

In accordance with one aspect of the present invention, there are provided novel
enzymes, as well as active fragments, analogs and derivatives thereof.

In accordance with another aspect of the present invention, there are provided
isolated nucleic acid molecules encoding the enzymes of the present invention including
mRNAs, cDNAs, genomic DNAs as well as active analogs and fragments of such enzymes.

In accordance with yet a further aspect of the present invention, there is provided
a process for producing such polypeptides by recombinant techniques comprising culturing
recombinant prokaryotic and/or eukaryotic host cells, containing a nucleic acid sequence
of the present invention, under conditions promoting expression of said enzymes and
subsequent recovery of said enzymes.

In accordance with yet a further aspect of the present invention, there is provided
a process for utilizing such enzymes, or polynucleotides encoding such enzymes, for
hydrolyzing lactose to galactose and glucose for use in the food processing industry, the
pharmaceutical industry, for example, to treat intolerance to lactose, as a diagnostic reporter
molecule, in corn wet milling, in the fruit juice industry, in baking, in the textile industry
and in the detergent industry.

In accordance with yet a further aspect of the present invention, there is provided
a process for utilizing such enzymes for hydrolyzing guar gum (a galactomannan
polysaccharide) to remove non-reducing terminal mannose residues. Further
polysaccharides such as galactomannan and the enzymes according to the invention that
degrade them have a variety of applications. Guar gum is commonly used as a thickening
agent in food and also is utilized in hydraulic fracturing in oil and gas recovery.
Consequently, mannanases are industrially relevant for the degradation and modification
of guar gums. Furthermore, a need exists for thermostable mannanases that are active in
extreme conditions associated with drilling and well stimulation.

In accordance with yet a further aspect of the present invention, there are also
provided nucleic acid probes comprising nucleic acid molecules of sufficient length to
specifically hybridize to a nucleic acid sequence of the present invention.

In accordance with yet a further aspect of the present invention, there is provided
a process for utilizing such enzymes, or polynucleotides encoding such enzymes, for in
vitro purposes related to scientific research, for example, to generate probes for identifying
similar sequences which might encode similar enzymes from other organisms by using
certain regions, i.e., conserved sequence regions, of the nucleotide sequence.

These and other aspects of the present invention should be apparent to those skilled 
in the art from the teachings herein. 

The polynucleotides of this invention were originally recovered from genomic gene
libraries derived from the following organisms:

M11TL is a new species of Desulfurococcus isolated from Diamond Pool in
Yellowstone National Park. The organism grows optimally at 85-88°C, pH 7.0 in a low salt
medium containing yeast extract, peptone, and gelatin as substrates with a N2/CO2 gas
phase.

OC1/4V is from the genus Thermotoga. The organism was isolated from
Yellowstone National Park. It grows optimally at 75°C in a low salt medium with cellulose
as a substrate and in gas phase.

Pyrococcus furiosus VC1 and (7EG1) is from the genus Pyrococcus. VC1 was
isolated from Vulcano, Italy. It grows optimally at 100°C in a high salt medium (marine)
containing elemental sulfur, yeast extract, peptone and starch as substrates and N2 in gas
phase.

Staphylothermus marinus F1 is from the genus Staphylothermus. F1 was isolated
from Vulcano, Italy. It grows optimally at 85°C, pH 6.5 in high salt medium (marine)
containing elemental sulfur and yeast extract as substrates and N2 in gas phase.

Thermococcus 9N-2 is from the genus Thermococcus. 9N-2 was isolated from
diffuse vent fluid in the East Pacific Rise. It is a strict anaerobe that grows optimally at
87°C.

Thermotoga maritima MSB8 and MSB8 (Clone # 6GP2 and 6GB4) is from the
genus Thermotoga, and was isolated from Vulcano, Italy. MSB8 grows optimally at 85°C,
pH 6.5 in a high salt medium (marine) containing starch and yeast extract as substrates and
N2 in gas phase.

Thermococcus alcaliphilus AEDII12RA is from the genus Thermococcus.
AEDII12RA grows optimally at 85°C, pH 9.5 in a high salt medium (marine) containing
polysulfides and yeast extract as substrates and in gas phase.

Thermococcus chitonophagus GC74 is from the genus Thermococcus. GC74 grows
optimally at 85°C, pH 6.0 in a high salt medium (marine) containing chitin, meat extract,
elemental sulfur and yeast extract as substrates and in gas phase.

AEPII 1a grows optimally at 85°C at pH 6.5 in marine medium under anaerobic
conditions. It has many substrates.

Bankia gouldi is from the genus Bankia.

Accordingly, the polynucleotides and enzymes encoded thereby are identified by
the organism from which they were isolated, and are sometimes hereinafter referred to as
"M11TL" (Figure 1 and SEQ ID NOS: 1 and 15), "OC1/4V-33B/G" (Figure 2 and SEQ ID
NOS: 2 and 16), "F1-12G" (Figure 3 and SEQ ID NOS: 3 and 17), "9N2-31B/G" (Figure 4
and SEQ ID NOS: 4 and 18), "MSB8" (Figure 5 and SEQ ID NOS: 5 and 19), "AEDII12RA-18B/G"
(Figure 6 and SEQ ID NOS: 6 and 20), "GC74-22G" (Figure 7 and SEQ ID NOS: 7
and 21), "VC1-7G1" (Figure 8 and SEQ ID NOS: 8 and 22), "37GP1" (Figure 9 and SEQ
ID NOS: 9 and 23), "6GC2" (Figure 10 and SEQ ID NOS: 10 and 24), "6GP2" (Figure 11
and SEQ ID NOS: 11 and 25), "AEPII 1a" (Figure 12 and SEQ ID NOS: 12 and 26),
"OC1/4V" (Figure 13 and SEQ ID NOS: 13 and 27), "6GP3" (Figure 14 and SEQ ID
NOS: 28), "MSB8-6GP2" (Figure 15 and SEQ ID NOS: 57 and 61), "MSB8-6GB4" (Figure
16 and SEQ ID NOS: 58 and 62), "VC1-7EG1" (Figure 17 and SEQ ID NOS: 59 and 63),
and "37GP4" (Figure 18 and SEQ ID NOS: 60 and 64).


WO 98/24799



PCT/US97/22623 



The polynucleotides and polypeptides of the present invention show identity at the 
nucleotide and protein level to known genes and proteins encoded thereby as shown in 
Table 1. 

Table 1

Clone                                        Gene/Protein with Closest Homology                          Protein    Nucleic Acid
                                                                                                         Identity   Identity
--------------------------------------------------------------------------------------------------------------------------------
M11TL-29G                                    Sulfolobus solfataricus DSM 1616/P1, β-galactosidase        51%        55%
OC1/4V-33B/G                                 Caldocellum saccharolyticum, β-glucosidase                  52%        57%
Staphylothermus marinus F1-12G               Bacillus polymyxa, β-galactosidase                          36%        48%
Thermococcus 9N2-31B/G                       Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase     51%        50%
Thermotoga maritima MSB8-6G                  Clostridium thermocellum bglB                               45%        53%
Thermococcus AEDII12RA-18B/G                 Bacillus polymyxa, β-galactosidase                          34%        48%
Thermococcus chitonophagus GC74-22G          Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase     46%        54%
Pyrococcus furiosus VC1-7G1                  Sulfolobus solfataricus/MT-4 β-galactosidase                46.4%      52.5%
Thermotoga maritima α-galactosidase (6GC2)   Pediococcus pentosaceus α-galactosidase                     49%        29%
Thermotoga maritima β-mannanase (6GP2)       Aspergillus aculeatus mannanase                             56%        37%
AEPII 1a β-mannosidase (63GB1)               Sulfolobus solfataricus β-galactosidase                     78%        56%
OC1/4V endoglucanase (33GP1)                 Clostridium thermocellum endo-1,4-β-endoglucanase           65%        43%
Thermotoga maritima pullulanase (6GP3)       Caldocellum saccharolyticum α-destrom 6-glucanohydrolase    72%        53%
Bankia gouldi mix Endoglucanase (37GP1)      None available

The polynucleotides and enzymes of the present invention show homology to each
other as shown in Table 2.

Table 2

Clone                               Gene/Protein with Closest Homology                    Protein    Nucleic Acid
                                                                                          Identity   Identity
-----------------------------------------------------------------------------------------------------------------
Staphylothermus marinus F1-12G      Thermococcus galactosidase, glucosidase               55%        57%
Thermococcus 9N2-31B/G              Thermococcus chitonophagus GC74-22G glucosidase       74%        66%
Pyrococcus furiosus VC1-7G1         Pyrococcus furiosus VC1-7B/G β-galactosidase          46.4%      54%

All the clones identified in Tables 1 and 2 encode polypeptides which have
α-glycosidase or β-glycosidase activity.

This invention, in addition to the isolated nucleic acid molecules encoding the
enzymes of the present invention, also provides substantially similar sequences. Isolated
nucleic acid sequences are substantially similar if: (i) they are capable of hybridizing under
conditions hereinafter described, to the polynucleotides of SEQ ID NOS: 1-14 and 57-60;
or (ii) they are DNA sequences which are degenerate to the polynucleotides of SEQ ID
NOS: 1-14 and 57-60. Degenerate DNA sequences encode the amino acid sequences of
SEQ ID NOS: 15-28 and 61-64, but have variations in the nucleotide coding sequences. As
used herein, substantially similar refers to the sequences having similar identity to the
sequences of the instant invention. The nucleotide sequences that are substantially the same
can be identified by hybridization or by sequence comparison. Enzyme sequences that are
substantially the same can be identified by one or more of the following: proteolytic
digestion, gel electrophoresis and/or microsequencing.

One means for isolating the nucleic acid molecules encoding the enzymes of the
present invention is to probe a gene library with a natural or artificially designed probe
using art recognized procedures (see, for example: Current Protocols in Molecular Biology,
Ausubel, F.M. et al. (Eds.) Greene Publishing Company Assoc. and John Wiley Interscience,
New York, 1989, 1992). It is appreciated by one skilled in the art that the polynucleotides
of SEQ ID NOS: 1-14 and 57-60 or fragments thereof (comprising at least 12 contiguous
nucleotides), are particularly useful probes. Other particularly useful probes for this purpose
are fragments hybridizable to the sequences of SEQ ID NOS: 1-14 and 57-60 (i.e.,
comprising at least 12 contiguous nucleotides).
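The 12-nucleotide minimum for probe fragments can be illustrated with a short Python sketch. It lists every contiguous 12-nucleotide window of a sequence and checks whether a second sequence contains at least one of them. The function names and the 20-nucleotide example string are hypothetical and are not taken from the sequence listing; exact string matching is only a stand-in for hybridization, which also tolerates mismatches.

```python
def contiguous_probes(sequence, length=12):
    """Return every contiguous window of `length` nucleotides in `sequence`."""
    seq = sequence.upper()
    # A sequence shorter than `length` yields an empty list.
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def shares_probe(reference, candidate, length=12):
    """True if `candidate` contains at least one `length`-nt window of `reference`."""
    cand = candidate.upper()
    return any(probe in cand for probe in contiguous_probes(reference, length))


# Hypothetical 20-nt sequence used only for illustration.
example = "ATGGCTAGCTAGGATCCGTA"
probes = contiguous_probes(example)
print(len(probes), probes[0])
```

A 20-nucleotide sequence yields nine overlapping 12-mers, any of which would meet the stated minimum length for a probe fragment.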

With respect to nucleic acid sequences which hybridize to specific nucleic acid
sequences disclosed herein, hybridization may be carried out under conditions of reduced
stringency, medium stringency or even stringent conditions. As an example of
oligonucleotide hybridization, a polymer membrane containing immobilized denatured
nucleic acids is first prehybridized for 30 minutes at 45°C in a solution consisting of 0.9 M
NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5
mg/ml polyriboadenylic acid. Approximately 2 X 10^7 cpm (specific activity 4-9 X 10^8
cpm/ug) of 32P end-labeled oligonucleotide probe are then added to the solution. After 12-16
hours of incubation, the membrane is washed for 30 minutes at room temperature in 1X
SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM Na2EDTA) containing 0.5%
SDS, followed by a 30 minute wash in fresh 1X SET at Tm-10°C for the oligonucleotide
probe. The membrane is then exposed to auto-radiographic film for detection of
hybridization signals.
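The final wash is specified relative to the probe's melting temperature (Tm), but the text does not say how Tm is obtained. As a rough sketch only, the Python below estimates Tm with the Wallace rule of thumb, 2°C per A or T plus 4°C per G or C, which is commonly applied to short oligonucleotides; that choice of formula is an assumption, not something stated in this document. The 18-mer is hypothetical.

```python
def wallace_tm(oligo):
    """Estimate Tm in deg C as 2*(A+T) + 4*(G+C) (Wallace rule, an assumed formula)."""
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def final_wash_temperature(oligo, offset=10):
    """Temperature of the second wash, taken as Tm minus `offset` deg C."""
    return wallace_tm(oligo) - offset


probe = "ATGGCTAGCTAGGATCCG"  # hypothetical 18-mer, not from the sequence listing
print(wallace_tm(probe), final_wash_temperature(probe))
```

For longer probes or different salt concentrations a nearest-neighbor or salt-adjusted estimate would normally be preferred over this rule of thumb.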

Stringent conditions means hybridization will occur only if there is at least 90%
identity, preferably at least 95% identity and most preferably at least 97% identity between
the sequences. Further, it is understood that a section of a 100 bps sequence that is 95 bps
in length has 95% identity with the 100 bps sequence from which it is obtained. See J.
Sambrook et al., Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor
Laboratory (1989), which is hereby incorporated by reference in its entirety.

As used herein, a first DNA (RNA) sequence is at least 70% and preferably at least
80% identical to another DNA (RNA) sequence if there is at least 70% and preferably at
least a 80% or 90% identity, respectively, between the bases of the first sequence and the
bases of the other sequence, when properly aligned with each other, for example when
aligned by BLASTN.

"Identity" as the term is used herein, refers to a polynucleotide sequence which 
comprises a percentage of the same bases as a reference polynucleotide (SEQ ID NOS: 1-14
and 57-60). For example, a polynucleotide which is at least 90% identical to a reference 
polynucleotide, has polynucleotide bases which are identical in 90% of the bases which 
make up the reference polynucleotide and may have different bases in 10% of the bases 
which comprise that polynucleotide sequence. 
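The identity definition above amounts to counting, over the bases of the reference, how many are matched by the aligned sequence. The Python sketch below shows that arithmetic for two sequences that have already been aligned to equal length (a "-" marks a gap). It does not perform the alignment itself, which the text leaves to tools such as BLASTN, and all names and example sequences are hypothetical.

```python
def percent_identity(reference, candidate):
    """Percentage of reference bases matched by the aligned candidate.

    Both strings must already be aligned to the same length; '-' marks a gap.
    Columns where the reference itself has a gap are not counted.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(r, c) for r, c in zip(reference.upper(), candidate.upper()) if r != "-"]
    if not pairs:
        raise ValueError("reference contains no bases")
    matches = sum(1 for r, c in pairs if r == c)
    return 100.0 * matches / len(pairs)


reference = "ACGT" * 25                 # hypothetical 100-base reference
mutated = list(reference)
for i in range(0, 100, 10):             # alter 10 of the 100 bases (A or G becomes T)
    mutated[i] = "T"
variant = "".join(mutated)
fragment = reference[:95] + "-" * 5     # 95-base fragment aligned against the reference

print(percent_identity(reference, variant))
print(percent_identity(reference, fragment))
```

The second call mirrors the 95-of-100 fragment case described for stringent conditions: the fragment matches 95 of the 100 reference bases.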

The present invention relates to polynucleotides which differ from the reference
polynucleotide such that the changes are silent changes, for example the changes do not alter
the amino acid sequence encoded by the polynucleotide. The present invention also relates
to nucleotide changes which result in amino acid substitutions, additions, deletions, fusions
and truncations in the polypeptide encoded by the reference polynucleotide. In a preferred
aspect of the invention these polypeptides retain the same biological action as the
polypeptide encoded by the reference polynucleotide.

It is also appreciated that such probes can be and are preferably labeled with an
analytically detectable reagent to facilitate identification of the probe. Useful reagents
include but are not limited to radioactivity, fluorescent dyes or enzymes capable of
catalyzing the formation of a detectable product. The probes are thus useful to isolate
complementary copies of DNA from other sources or to screen such sources for related
sequences.

The polynucleotides of this invention were recovered from genomic gene libraries
from the organisms listed in Table 1. For example, gene libraries can be generated in the
Lambda ZAP II cloning vector (Stratagene Cloning Systems). Mass excisions can be
performed on these libraries to generate libraries in the pBluescript phagemid. Libraries
are thus generated and excisions performed according to the protocols/methods hereinafter
described.

The excision libraries are introduced into the E. coli strain BW14893 F'kan1A.
Expression clones are then identified using a high temperature filter assay. Expression
clones encoding several glucanases and several other glycosidases are identified and
repurified. The polynucleotides, and enzymes encoded thereby, of the present invention,
yield the activities as described above.

The coding sequences for the enzymes of the present invention were identified by 
screening the genomic DNAs prepared for the clones having glucosidase or galactosidase 
activity. 

An example of such an assay is a high temperature filter assay wherein expression
clones were identified using buffer Z (see recipe below) containing 1 mg/ml of the
substrate 5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside (XGLU) (Diagnostic Chemicals
Limited or Sigma) after introducing an excision library into the E. coli strain
BW14893 F'kan1A. Expression clones encoding XGLUases were identified and repurified
from M11TL, OC1/4V, Pyrococcus furiosus VC1, Staphylothermus marinus F1,
Thermococcus 9N-2, Thermotoga maritima MSB8, Thermococcus alcaliphilus AEDII12RA,
and Thermococcus chitonophagus GC74.

Z-buffer (referenced in Miller, J.H. (1992) A Short Course in Bacterial Genetics,
p. 445), per liter:

Na2HPO4·7H2O         16.1 g
NaH2PO4·7H2O         5.5 g
KCl                  0.75 g
MgSO4·7H2O           0.246 g
β-mercaptoethanol    2.7 ml
Adjust pH to 7.0
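The per-liter masses in the recipe can be converted to molar concentrations as a quick sanity check when preparing the buffer. The Python sketch below does this for three of the salts. The formula weights are standard values supplied for this illustration and do not appear in the text. The sodium dihydrogen phosphate entry is left out because its hydrate form is not clearly legible in the recipe.

```python
# Formula weights in g/mol: standard values assumed for this sketch, not from the text.
FORMULA_WEIGHT = {
    "Na2HPO4.7H2O": 268.07,
    "KCl": 74.55,
    "MgSO4.7H2O": 246.47,
}

# Masses per liter as listed in the recipe.
GRAMS_PER_LITER = {
    "Na2HPO4.7H2O": 16.1,
    "KCl": 0.75,
    "MgSO4.7H2O": 0.246,
}


def millimolar(grams_per_liter, formula_weight):
    """Convert g/L to mM given a formula weight in g/mol."""
    return 1000.0 * grams_per_liter / formula_weight


concentrations = {
    name: millimolar(GRAMS_PER_LITER[name], FORMULA_WEIGHT[name])
    for name in GRAMS_PER_LITER
}

for name, value in concentrations.items():
    print(f"{name}: {value:.2f} mM")
```

Under these assumed formula weights the three salts come out near 60 mM, 10 mM and 1 mM respectively.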

High Temperature Filter Assay
(1)  The F factor F'kan (from E. coli strain CSH118)(1) was introduced into the pho- pnh-
     lac- strain BW14893(2). The filamentous phage library was plated
     on the resulting strain, BW14893 F'kan. (Miller, J.H. (1992) A Short Course in
     Bacterial Genetics; Lee, K-S., Metcalf, et al., (1992) Evidence for two phosphonate
     degradative pathways in Enterobacter aerogenes, J. Bacteriol., 174:2501-2510.)

(2)  After growth on 100 mm LB plates containing 100 µg/ml ampicillin, 80 µg/ml
     methicillin and 1 mM IPTG, colony lifts were performed using Millipore HATF
     membrane filters.

(3)  The colonies transferred to the filters were lysed with chloroform vapor in 150 mm
     glass petri dishes.

(4)  The filters were transferred to 100 mm glass petri dishes containing a piece of
     Whatman 3MM filter paper saturated with buffer.

     (a) When testing for galactosidase activity (XGALase), the 3MM paper was
     saturated with Z buffer containing 1 mg/ml XGAL (ChemBridge
     Corporation). After the filter bearing lysed colonies was transferred to the glass
     petri dish, the dish was placed in an oven at 80-85°C.

     (b) When testing for glucosidase activity (XGLUase), the 3MM paper was saturated
     with Z buffer containing 1 mg/ml XGLU. After the filter bearing
     lysed colonies was transferred to the glass petri dish, the dish was placed in an
     oven at 80-85°C.

(5)  'Positives' were observed as blue spots on the filter membranes. The following
     filter rescue technique was used to retrieve plasmid from a lysed positive colony. A
     Pasteur pipette (or glass capillary tube) was used to core blue spots on the filter
     membrane. The small filter disk was placed in an Eppendorf tube containing 20 µl
     water. The Eppendorf tube was incubated at 75°C for 5 minutes, followed by
     vortexing to elute plasmid DNA off the filter. This DNA was transformed into
     electrocompetent E. coli DH10B cells for Thermotoga maritima MSB8-6G,
     Staphylothermus marinus F1-12G, Thermococcus AEDII12RA-18B/G, Thermococcus
     chitonophagus GC74-22G, M11TL and OC1/4V. Electrocompetent BW14893 F'kan1A
     E. coli were used for Thermococcus 9N2-31B/G and Pyrococcus furiosus VC1-7G1.
     The filter-lift assay was repeated on transformation plates to identify 'positives'.
     Transformation plates were returned to a 37°C incubator after the filter lift to
     regenerate colonies. 3 ml of LB liquid containing 100 µg/ml ampicillin was
     inoculated with repurified positives and incubated at 37°C overnight. Plasmid DNA
     was isolated from these cultures and the plasmid insert was sequenced.
     In some instances where the plates used for the initial colony lifts contained non-
     confluent colonies, a specific colony corresponding to a blue spot on the filter could
     be identified on a regenerated plate and repurified directly, instead of using the filter
     rescue technique.

Another example of such an assay is a variation of the high temperature filter assay
wherein colony-laden filters are heat-killed at different temperatures (for example, 105°C
for 20 minutes) to monitor thermostability. The 3MM paper is saturated with different
buffers (i.e., 100 mM NaCl, 5 mM MgCl2, 100 mM Tris-Cl (pH 9.5)) to determine enzyme
activity under different buffer conditions.

A β-glucosidase assay may also be employed, wherein GlcpβNp is used as an
artificial substrate (aryl-β-glucosidase). The increase in absorbance at 405 nm as a result
of p-nitrophenol (pNp) liberation is followed on a Hitachi U-1100 spectrophotometer
equipped with a thermostatted cuvette holder. The assays may be performed at 80°C or
90°C in closed 1-ml quartz cuvettes. A standard reaction mixture contains 150 mM
trisodium substrate, pH 5.0 (at 80°C), and 0.95 mM pNp derivative (ε pNp = 0.561 mM⁻¹ cm⁻¹).
The reaction mixture is allowed to reach the desired temperature, after which the
reaction is started by injecting an appropriate amount of enzyme (1.06 ml final volume).
1 U β-glucosidase activity is defined as that amount required to catalyze the
formation of 1.0 µmol pNp/min. D-cellobiose may also be used as a substrate.
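
By way of non-limiting illustration, the unit definition above may be applied to a measured rate of absorbance change as sketched below. The extinction coefficient and final volume are taken from the assay description; the 1 cm path length, the function names and the example rate are assumptions introduced only for this sketch and are not part of the described assay.

```python
# Values taken from the assay description above.
EPSILON_PNP = 0.561      # mM^-1 cm^-1, extinction coefficient of pNp at 405 nm
FINAL_VOLUME_ML = 1.06   # ml, final reaction volume


def beta_glucosidase_units(delta_a405_per_min, path_length_cm=1.0,
                           volume_ml=FINAL_VOLUME_ML, epsilon=EPSILON_PNP):
    """Return activity in U (umol pNp liberated per minute).

    path_length_cm defaults to 1.0, which is an assumption of this sketch.
    """
    # Beer-Lambert law: rate in mM/min = (dA/min) / (epsilon * path length)
    rate_mm_per_min = delta_a405_per_min / (epsilon * path_length_cm)
    # mM multiplied by ml gives umol, so this is umol/min, i.e. U
    return rate_mm_per_min * volume_ml


def specific_activity(units, enzyme_mg):
    """Return specific activity in U per mg of protein (hypothetical helper)."""
    if enzyme_mg <= 0:
        raise ValueError("enzyme_mg must be positive")
    return units / enzyme_mg


if __name__ == "__main__":
    # Hypothetical measured slope of 0.25 absorbance units per minute
    units = beta_glucosidase_units(0.25)
    print(f"{units:.3f} U in the cuvette")
```

A slope equal to the extinction coefficient (0.561 per minute) corresponds to 1 mM pNp formed per minute, which in the stated final volume is 1.06 U.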

An ONPG assay for β-galactosidase activity is described by Miller, J.H. (1992) A
Short Course in Bacterial Genetics and Miller, J.H. (1992) Experiments in Molecular
Genetics, the contents of which are hereby incorporated by reference in their entirety.

A quantitative fluorometric assay for β-galactosidase specific activity is described
by Youngman, P. (1987) Plasmid Vectors for Recovering and Exploiting Tn917
Transpositions in Bacillus and other Gram-Positive Bacteria. In Plasmids: A Practical
Approach (ed. K. Hardy) pp. 79-103. IRL Press, Oxford. A description of the procedure can
be found in Miller (1992) p. 75-77, the contents of which are incorporated by reference
herein in their entirety.

The polynucleotides of the present invention may be in the form of DNA which
DNA includes cDNA, genomic DNA, and synthetic DNA. The DNA may be double- 
stranded or single-stranded, and if single stranded may be the coding strand or non-coding 
(anti-sense) strand. The coding sequences which encode the mature enzymes may be
identical to the coding sequences shown in Figures 1-18 (SEQ ID NOS: 1-14 and 57-60) or
may be a different coding sequence which, as a result of the redundancy
or degeneracy of the genetic code, encodes the same mature enzymes as the DNA of
Figures 1-18 (SEQ ID NOS: 1-14 and 57-60).
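
The degeneracy referred to above may be illustrated by a minimal sketch that translates two different coding sequences with the standard genetic code and confirms that they encode the same polypeptide. The two short sequences are hypothetical examples chosen for illustration and are not taken from the sequence listing; the function names are likewise assumptions of the sketch.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding strand in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def encode_same_enzyme(seq_a, seq_b):
    """True if both coding sequences translate to the same polypeptide."""
    return translate(seq_a) == translate(seq_b)


# Hypothetical coding sequences that differ only at synonymous codon positions.
SEQ_A = "ATGGCTAAAGGT"
SEQ_B = "ATGGCCAAGGGA"

if __name__ == "__main__":
    print(translate(SEQ_A), translate(SEQ_B), encode_same_enzyme(SEQ_A, SEQ_B))
```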

The polynucleotide which encodes for the mature enzyme of Figures 1-18 (SEQ ID 
NOS: 15-28 and 61-64) may include, but is not limited to: only the coding sequence for the 
mature enzyme; the coding sequence for the mature enzyme and additional coding sequence
such as a leader sequence or a proprotein sequence; the coding sequence for the mature
enzyme (and optionally additional coding sequence) and non-coding sequence, such as
introns or non-coding sequence 5' and/or 3' of the coding sequence for the mature enzyme.

Thus, the term "polynucleotide encoding an enzyme (protein)" encompasses a 
polynucleotide which includes only coding sequence for the enzyme as well as a 
polynucleotide which includes additional coding and/or non-coding sequence. 

The present invention further relates to variants of the hereinabove described 
polynucleotides which encode for fragments, analogs and derivatives of the enzymes having
the deduced amino acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). The 
variant of the polynucleotide may be a naturally occurring allelic variant of the 
polynucleotide or a non-naturally occurring variant of the polynucleotide. 

Thus, the present invention includes polynucleotides encoding the same mature 
enzymes as shown in Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as variants of 
such polynucleotides which variants encode for a fragment, derivative or analog of the 
enzymes of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). Such nucleotide variants 
include deletion variants, substitution variants and addition or insertion variants. 

As hereinabove indicated, the polynucleotides may have a coding sequence which 
is a naturally occurring allelic variant of the coding sequences shown in Figures 1-18 (SEQ
ID NOS: 1-14 and 57-60). As known in the art, an allelic variant is an alternate form of a
polynucleotide sequence which may have a substitution, deletion or addition of one or more 
nucleotides, which does not substantially alter the function of the encoded enzyme. 

Fragments of the full-length gene of the present invention may be used as a
hybridization probe for a cDNA or a genomic library to isolate the full length DNA and to 
isolate other DNAs which have a high sequence similarity to the gene or similar biological 
activity. Probes of this type preferably have at least 10, preferably at least 15, and even 
more preferably at least 30 bases and may contain, for example, at least 50 or more bases. 
The probe may also be used to identify a DNA clone corresponding to a full length 
transcript and a genomic clone or clones that contain the complete gene including 
regulatory and promoter regions, exons, and introns. An example of a screen comprises
isolating the coding region of the gene by using the known DNA sequence to synthesize an 
oligonucleotide probe. Labeled oligonucleotides having a sequence complementary to that 
of the gene of the present invention are used to screen a library of genomic DNA to 
determine which members of the library the probe hybridizes to. 

The present invention further relates to polynucleotides which hybridize to the
hereinabove described sequences if there is at least 70%, preferably at least 90%, and more
preferably at least 95% identity between the sequences. The present invention particularly
relates to polynucleotides which hybridize under stringent conditions to the
hereinabove-described polynucleotides. As herein used, the term "stringent conditions" means
hybridization will occur only if there is at least 95% and preferably at least 97% identity
between the sequences. The polynucleotides which hybridize to the hereinabove described
polynucleotides in a preferred embodiment encode enzymes which retain
substantially the same biological function or activity as the mature enzyme encoded by the
DNA of Figures 1-18 (SEQ ID NOS: 1-14 and 57-60).
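
The identity thresholds recited above may be sketched, for illustration only, as a simple calculation over two sequences that are assumed to have been aligned beforehand and to be of equal length. Hybridization itself is determined experimentally; the computed percentage is merely a proxy, and the function names, tier labels and example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two pre-aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(seq_a.upper(), seq_b.upper()) if x == y)
    return 100.0 * matches / len(seq_a)


def identity_tier(pct):
    """Map a percent identity onto the thresholds recited in the text."""
    if pct >= 97:
        return "at least 97%"
    if pct >= 95:
        return "at least 95%"
    if pct >= 90:
        return "at least 90%"
    if pct >= 70:
        return "at least 70%"
    return "below 70%"


if __name__ == "__main__":
    # Hypothetical 10-base sequences differing at a single position
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAT")
    print(pct, identity_tier(pct))
```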

Alternatively, the polynucleotide may have at least 15 bases, preferably at least 30 
bases, and more preferably at least 50 bases which hybridize to any part of a polynucleotide 
of the present invention and which has an identity thereto, as hereinabove described, and 
which may or may not retain activity. For example, such polynucleotides may be employed
as probes for the polynucleotides of SEQ ID NOS: 1-14 and 57-60, for example, for
recovery of the polynucleotide or as a diagnostic probe or as a PCR primer. 

Thus, the present invention is directed to polynucleotides having at least a 70% 
identity, preferably at least 90% identity and more preferably at least a 95% identity to a 
polynucleotide which encodes the enzymes of SEQ ID NOS: 15-28 and 61-64 as well as 
fragments thereof, which fragments have at least 15 bases, preferably at least 30 bases and
most preferably at least 50 bases, which fragments are at least 90% identical, preferably at 
least 95% identical and most preferably at least 97% identical under stringent conditions 
to any portion of a polynucleotide of the present invention. 

The present invention further relates to enzymes which have the deduced amino acid
sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as fragments, analogs
and derivatives of such enzymes.

The terms "fragment," "derivative" and "analog" when referring to the enzymes of
Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) mean enzymes which retain essentially the
same biological function or activity as such enzymes. Thus, an analog includes a proprotein 
which can be activated by cleavage of the proprotein portion to produce an active mature 
enzyme. 

The enzymes of the present invention may be a recombinant enzyme, a natural 
enzyme or a synthetic enzyme, preferably a recombinant enzyme. 

The fragment, derivative or analog of the enzymes of Figures 1-18 (SEQ ID NOS: 
15-28 and 61-64) may be (i) one in which one or more of the amino acid residues are 
substituted with a conserved or non-conserved amino acid residue (preferably a conserved 
amino acid residue) and such substituted amino acid residue may or may not be one 
encoded by the genetic code, or (ii) one in which one or more of the amino acid residues 
includes a substituent group, or (iii) one in which the mature enzyme is fused with another 
compound, such as a compound to increase the half-life of the enzyme (for example, 
polyethylene glycol), or (iv) one in which the additional amino acids are fused to the mature 
enzyme, such as a leader or secretory sequence or a sequence which is employed for 
purification of the mature enzyme or a proprotein sequence. Such fragments, derivatives
and analogs are deemed to be within the scope of those skilled in the art from the teachings
herein. 

The enzymes and polynucleotides of the present invention are preferably provided 
in an isolated form, and preferably are purified to homogeneity. 

The term "isolated" means that the material is removed from its original
environment (e.g., the natural environment if it is naturally occurring). For example, a
naturally-occurring polynucleotide or enzyme present in a living animal is not isolated, but
the same polynucleotide or enzyme, separated from some or all of the coexisting materials
in the natural system, is isolated. Such polynucleotides could be part of a vector and/or
such polynucleotides or enzymes could be part of a composition, and still be isolated in that
such vector or composition is not part of its natural environment.

The enzymes of the present invention include the enzymes of SEQ ID NOS: 15-28
and 61-64 (in particular the mature enzyme) as well as enzymes which have at least 70%
similarity (preferably at least 70% identity) to the enzymes of SEQ ID NOS: 15-28 and
61-64 and more preferably at least 90% similarity (more preferably at least 90% identity) to
the enzymes of SEQ ID NOS: 15-28 and 61-64 and still more preferably at least 95%
similarity (still more preferably at least 95% identity) to the enzymes of SEQ ID NOS:
15-28 and 61-64 and also include portions of such enzymes with such portion of the enzyme
generally containing at least 30 amino acids and more preferably at least 50 amino acids.

As known in the art, "similarity" between two enzymes is determined by comparing
the amino acid sequence and its conserved amino acid substitutes of one enzyme to the
sequence of a second enzyme.

A variant, i.e. a "fragment", "analog" or "derivative" polypeptide, and reference
polypeptide may differ in amino acid sequence by one or more substitutions, additions,
deletions, fusions and truncations, which may be present in any combination.

Among preferred variants are those that vary from a reference by conservative 
amino acid substitutions. Such substitutions are those that substitute a given amino acid in 
a polypeptide by another amino acid of like characteristics. Typically seen as conservative
substitutions are the replacements, one for another, among the aliphatic amino acids Ala,
Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic
residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of
the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr.
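
For illustration only, the conservative substitution groups listed above can be used to compute both identity and similarity between a reference polypeptide and a variant. The sketch assumes two pre-aligned sequences of equal length in one-letter code; the group table follows the text, while the function names and the example peptides are hypothetical.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("AVLI"),  # aliphatic: Ala, Val, Leu, Ile
    set("ST"),    # hydroxyl: Ser, Thr
    set("DE"),    # acidic: Asp, Glu
    set("NQ"),    # amide: Asn, Gln
    set("KR"),    # basic: Lys, Arg
    set("FY"),    # aromatic: Phe, Tyr
]


def is_conservative(res_a, res_b):
    """True if two different residues belong to the same listed group."""
    res_a, res_b = res_a.upper(), res_b.upper()
    if res_a == res_b:
        return False
    return any(res_a in g and res_b in g for g in CONSERVATIVE_GROUPS)


def identity_and_similarity(reference, variant):
    """Return (percent identity, percent similarity) for aligned peptides."""
    if not reference or len(reference) != len(variant):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(zip(reference.upper(), variant.upper()))
    identical = sum(1 for a, b in pairs if a == b)
    conservative = sum(1 for a, b in pairs if is_conservative(a, b))
    n = len(pairs)
    return 100.0 * identical / n, 100.0 * (identical + conservative) / n


if __name__ == "__main__":
    # Hypothetical 8-residue peptides
    print(identity_and_similarity("MKVLDSAF", "MRVIDSAW"))
```

In this sketch similarity counts identical positions plus positions occupied by a listed conservative substitute, so it is never lower than identity.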

Most highly preferred are variants which retain the same biological function and
activity as the reference polypeptide from which they vary.

Fragments or portions of the enzymes of the present invention may be employed for 
producing the corresponding full-length enzyme by peptide synthesis; therefore, the 
fragments may be employed as intermediates for producing the full-length enzymes. 
Fragments or portions of the polynucleotides of the present invention may be used to 
synthesize full-length polynucleotides of the present invention. 

The present invention also relates to vectors which include polynucleotides of the 
present invention, host cells which are genetically engineered with vectors of the invention 
and the production of enzymes of the invention by recombinant techniques. 

Host cells are genetically engineered (transduced or transformed or transfected) with 
the vectors of this invention which may be, for example, a cloning vector or an expression 
vector. The vector may be, for example, in the form of a plasmid, a viral particle, a phage, 
etc. The engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants or amplifying the genes of the 
present invention. The culture conditions, such as temperature, pH and the like, are those
previously used with the host cell selected for expression, and will be apparent to the 
ordinarily skilled artisan. 

The polynucleotides of the present invention may be employed for producing 
enzymes by recombinant techniques. Thus, for example, the polynucleotide may be 
included in any one of a variety of expression vectors for expressing an enzyme. Such 
vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., 
derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors 
derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, 
adenovirus, fowl pox virus, and pseudorabies. However, any other vector may be used as
long as it is replicable and viable in the host.

The appropriate DNA sequence may be inserted into the vector by a variety of
procedures. In general, the DNA sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such procedures and others are
deemed to be within the scope of those skilled in the art. 

The DNA sequence in the expression vector is operatively linked to an appropriate 
expression control sequence(s) (promoter) to direct mRNA synthesis. As representative 
examples of such promoters, there may be mentioned: LTR or SV40 promoter, the E. coli
lac or trp, the phage lambda PL promoter and other promoters known to control expression
of genes in prokaryotic or eukaryotic cells or their viruses. The expression vector also 
contains a ribosome binding site for translation initiation and a transcription terminator. 
The vector may also include appropriate sequences for amplifying expression. 

In addition, the expression vectors preferably contain one or more selectable marker 
genes to provide a phenotypic trait for selection of transformed host cells such as 
dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as 
tetracycline or ampicillin resistance in E. coli.

The vector containing the appropriate DNA sequence as hereinabove described, as 
well as an appropriate promoter or control sequence, may be employed to transform an
appropriate host to permit the host to express the protein. 

As representative examples of appropriate hosts, there may be mentioned: bacterial 
cells, such as E. coli, Streptomyces, Bacillus subtilis; fungal cells, such as yeast; insect cells
such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or Bowes
melanoma; adenoviruses; plant cells, etc. The selection of an appropriate host is deemed 
to be within the scope of those skilled in the art from the teachings herein. 

More particularly, the present invention also includes recombinant constructs 
comprising one or more of the sequences as broadly described above. The constructs 
comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention 
has been inserted, in a forward or reverse orientation. In a preferred aspect of this 
embodiment, the construct further comprises regulatory sequences, including, for example, 
a promoter, operably linked to the sequence. Large numbers of suitable vectors and
promoters are known to those of skill in the art, and are commercially available. The
following vectors are provided by way of example. Bacterial: pQE70, pQE60, pQE-9
(Qiagen), pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A, pNH46A
(Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic:
pSV2CAT, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia).
However, any other plasmid or vector may be used as long as it is replicable and viable
in the host.

Promoter regions can be selected from any desired gene using CAT 
(chloramphenicol transferase) vectors or other vectors with selectable markers. Two 
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include 
lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Eukaryotic promoters include CMV
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and
mouse metallothionein-I. Selection of the appropriate vector and promoter is well within
the level of ordinary skill in the art.

In a further embodiment, the present invention relates to host cells containing the
above-described constructs. The host cell can be a higher eukaryotic cell, such as a
mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a
prokaryotic cell, such as a bacterial cell. Introduction of the construct into the host cell can
be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or
electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology, 
(1986)). 

The constructs in host cells can be used in a conventional manner to produce the 
gene product encoded by the recombinant sequence. Alternatively, the enzymes of the 
invention can be synthetically produced by conventional peptide synthesizers. 

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells 
under the control of appropriate promoters. Cell-free translation systems can also be 
employed to produce such proteins using RNAs derived from the DNA constructs of the 
present invention. Appropriate cloning and expression vectors for use with prokaryotic and 
eukaryotic hosts are described by Sambrook, et al., Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is
hereby incorporated by reference. 

Transcription of the DNA encoding the enzymes of the present invention by higher 
eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are 
cis-acting elements of DNA, usually from about 10 to 300 bp, that act on a promoter to
increase its transcription. Examples include the SV40 enhancer on the late side of the
replication origin bp 100 to 270, a cytomegalovirus early promoter enhancer, the polyoma 
enhancer on the late side of the replication origin, and adenovirus enhancers. 

Generally, recombinant expression vectors will include origins of replication and
selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance
gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a
highly-expressed gene to direct transcription of a downstream structural sequence. Such promoters
can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate
kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The
heterologous structural sequence is assembled in appropriate phase with translation
initiation and termination sequences, and preferably, a leader sequence capable of directing
secretion of translated enzyme. Optionally, the heterologous sequence can encode a fusion
enzyme including an N-terminal identification peptide imparting desired characteristics,
e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural
DNA sequence encoding a desired protein together with suitable translation initiation and
termination signals in operable reading phase with a functional promoter. The vector will
comprise one or more phenotypic selectable markers and an origin of replication to ensure
maintenance of the vector and to, if desirable, provide amplification within the host.
Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella
typhimurium and various species within the genera Pseudomonas, Streptomyces, and
Staphylococcus, although others may also be employed as a matter of choice.

As a representative but nonlimiting example, useful expression vectors for bacterial 
use can comprise a selectable marker and bacterial origin of replication derived from
commercially available plasmids comprising genetic elements of the well known cloning
vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotec, Madison, WI,
USA). These pBR322 "backbone" sections are combined with an appropriate promoter and
the structural sequence to be expressed. 

Following transformation of a suitable host strain and growth of the host strain to 
an appropriate cell density, the selected promoter is induced by appropriate means (e.g., 
temperature shift or chemical induction) and cells are cultured for an additional period. 

Cells are typically harvested by centrifugation, disrupted by physical or chemical
means, and the resulting crude extract retained for further purification. 

Microbial cells employed in expression of proteins can be disrupted by any 
convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or 
use of cell lysing agents; such methods are well known to those skilled in the art.

Various mammalian cell culture systems can also be employed to express 
recombinant protein. Examples of mammalian expression systems include the COS-7 lines 
of monkey kidney fibroblasts, described by Gluzman, Cell, 23:175 (1981), and other cell 
lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa
and BHK cell lines. Mammalian expression vectors will comprise an origin of replication,
a suitable promoter and enhancer, and also any necessary ribosome binding sites,
polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences,
and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 splice
and polyadenylation sites may be used to provide the required nontranscribed genetic
elements. 

The enzyme can be recovered and purified from recombinant cell cultures by 
methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or 
cation exchange chromatography, phosphocellulose chromatography, hydrophobic 
interaction chromatography, affinity chromatography, hydroxylapatite chromatography and 
lectin chromatography. Protein refolding steps can be used, as necessary, in completing
configuration of the mature protein. Finally, high performance liquid chromatography
(HPLC) can be employed for final purification steps. 

The enzymes of the present invention may be a naturally purified product, or a
product of chemical synthetic procedures, or produced by recombinant techniques from a
prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and
mammalian cells in culture). Depending upon the host employed in a recombinant
production procedure, the enzymes of the present invention may be glycosylated or may be
non-glycosylated. Enzymes of the invention may or may not also include an initial
methionine amino acid residue.

β-galactosidase hydrolyzes lactose to galactose and glucose. Accordingly, the
OC1/4V, 9N2-31B/G, AEDII12RA-18B/G and F1-12G enzymes may be employed in the
food processing industry for the production of low lactose content milk and for the
production of galactose or glucose from lactose contained in whey obtained in a large
amount as a by-product in the production of cheese. Generally, it is desired that enzymes
used in food processing, such as the aforementioned β-galactosidases, be stable at elevated
temperatures to help prevent microbial contamination.

These enzymes may also be employed in the pharmaceutical industry. The enzymes
are used to treat intolerance to lactose. In this case, a thermostable enzyme is desired, as
well. Thermostable β-galactosidases also have uses in diagnostic applications, where they
are employed as reporter molecules.

Glucosidases act on soluble cellooligosaccharides from the non-reducing end to give
glucose as the sole product. Glucanases (endo- and exo-) act in the depolymerization of
cellulose, generating more non-reducing ends (endo-glucanases, for instance, act on internal
linkages yielding cellobiose, glucose and cellooligosaccharides as products).
β-glucosidases are used in applications where glucose is the desired product. Accordingly,
M11TL, F1-12G, GC74-22G, MSB8-6G, OC1/4V, VC1-7G1, 9N2-31B/G and
AEDII12RA-18B/G may be employed in a wide variety of industrial applications, including
in corn wet milling for the separation of starch and gluten, in the fruit industry for
clarification and equipment maintenance, in baking for viscosity reduction, in the textile
industry for the processing of blue jeans, and in the detergent industry as an additive. For
these and other applications, thermostable enzymes are desirable. 

Antibodies generated against the enzymes corresponding to a sequence of the 
present invention can be obtained by direct injection of the enzymes into an animal or by 
administering the enzymes to an animal, preferably a nonhuman. The antibody so obtained
will then bind the enzymes themselves. In this manner, even a sequence encoding only a
fragment of the enzymes can be used to generate antibodies binding the whole native 
enzymes. Such antibodies can then be used to isolate the enzyme from cells expressing that 
enzyme. 

For preparation of monoclonal antibodies, any technique which provides antibodies 
produced by continuous cell line cultures can be used. Examples include the hybridoma 
technique (Kohler and Milstein, 1975, Nature, 256:495-497), the trioma technique, the 
human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the
EBV-hybridoma technique to produce human monoclonal antibodies (Cole, et al., 1985, in
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

Techniques described for the production of single chain antibodies (U.S. Patent
4,946,778) can be adapted to produce single chain antibodies to immunogenic enzyme
products of this invention. Also, transgenic mice may be used to express humanized 
antibodies to immunogenic enzyme products of this invention. 

Antibodies generated against the enzyme of the present invention may be used in 
screening for similar enzymes from other organisms and samples. Such screening 
techniques are known in the art; for example, one such screening assay is described in
"Methods for Measuring Cellulase Activities", Methods in Enzymology, Vol. 160, pp. 87-116,
which is hereby incorporated by reference in its entirety.

The present invention will be further described with reference to the following 
examples; however, it is to be understood that the present invention is not limited to such 
examples. All parts or amounts, unless otherwise specified, are by weight. 

In order to facilitate understanding of the following examples certain frequently 
occurring methods and/or terms will be described.

"Plasmids" are designated by a lower case p preceded and/or followed by capital
letters and/or numbers. The starting plasmids herein are either commercially available, 
publicly available on an unrestricted basis, or can be constructed from available plasmids 
in accord with published procedures. In addition, equivalent plasmids to those described 
are known in the art and will be apparent to the ordinarily skilled artisan. 

"Digestion" of DNA refers to catalytic cleavage of the DNA with a restriction 
enzyme that acts only at certain sequences in the DNA. The various restriction enzymes 
used herein are commercially available and their reaction conditions, cofactors and other 
requirements were used as would be known to the ordinarily skilled artisan. For analytical 
purposes, typically 1 µg of plasmid or DNA fragment is used with about 2 units of enzyme
in about 20 µl of buffer solution. For the purpose of isolating DNA fragments for plasmid
construction, typically 5 to 50 µg of DNA are digested with 20 to 250 units of enzyme in
a larger volume. Appropriate buffers and substrate amounts for particular restriction
enzymes are specified by the manufacturer. Incubation times of about 1 hour at 37°C are
ordinarily used, but may vary in accordance with the supplier's instructions. After digestion
the reaction is electrophoresed directly on a polyacrylamide gel to isolate the desired
fragment.

Size separation of the cleaved fragments is performed using 8 percent
polyacrylamide gel described by Goeddel, D. et al., Nucleic Acids Res., 8:4057 (1980).
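
Because a restriction enzyme acts only at certain sequences in the DNA, the positions at which a given enzyme may cut can be located by a simple search for its recognition sequence. The following is a minimal sketch under stated assumptions: the recognition sequences are the commonly published ones for the enzymes named in the examples, the test sequence is hypothetical, and only the given strand is scanned (all sites shown are palindromic, so the opposite strand yields the same positions).

```python
# Commonly published recognition sequences for enzymes named in the examples.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}


def find_sites(dna, sites=RECOGNITION_SITES):
    """Return {enzyme: [0-based start positions]} for each recognition site."""
    dna = dna.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = dna.find(site)
        while start != -1:
            positions.append(start)
            # Advance by one base so that overlapping sites are also reported
            start = dna.find(site, start + 1)
        hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    # Hypothetical fragment carrying one EcoRI site and one BamHI site
    fragment = "CCGAGAATTCATTAAAGGATCCTT"
    for enzyme, positions in find_sites(fragment).items():
        if positions:
            print(enzyme, positions)
```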

"Oligonucleotides" refers to either a single stranded polydeoxynucleotide or two 
complementary polydeoxynucleotide strands which may be chemically synthesized. Such 
synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another 
oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A 
synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated. 

"Ligation" refers to the process of forming phosphodiester bonds between two 
double stranded nucleic acid fragments (Maniatis, T., et al., Id., p. 146). Unless otherwise
provided, ligation may be accomplished using known buffers and conditions with 10 units
of T4 DNA ligase ("ligase") per 0.5 µg of approximately equimolar amounts of the DNA
fragments to be ligated.

Unless otherwise stated, transformation was performed as described in the method
of Graham, F. and Van der Eb, A., Virology, 52:456-457 (1973). 

Example 1

Bacterial Expression and Purification of Glycosidase Enzymes

DNA encoding the enzymes of the present invention, SEQ ID NOS: 1-14 and 57-60,
were initially amplified from a pBluescript vector containing the DNA by the PCR
technique using the primers noted herein. The amplified sequences were then inserted into
the respective pQE vector listed beneath the primer sequences, and the enzyme was
expressed according to the protocols set forth herein. The 5' and 3' primer sequences for
the respective genes are as follows:

Thermococcus AEDII12RA-18B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC 3' (SEQ ID NO:29)
3' CGGAAGATCTTCATAGCTCCGGAAGCCCATA 5' (SEQ ID NO:30)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl
II.



OC1/4V-33B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGAAGGTCCGATTTTCC 3' (SEQ ID NO:31)
3' CGGAAGATCnTAAGATnTAGAAATTCCTT 5' (SEQ ID NO:32)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl
II.



XTTAAAGAGGAGAAATrAACTATGCTACCAGAAGGCnTCTC 3" 



Thermococcus 9N2 - 31B/G 

5' CCGAGAATTCAl 
(SEQ IDNO:33) 

3- CGGAGGTACCTCACCCAAGTCCGAACTTCTC 5' (SEQ ID NO-34) 

Vector: pQE30; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 

Staphylothermus marinus YX - 12G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGGTTTCCTGATTAT 3'
(SEQ ID NO:35)

3' CGGAAGATCTTTATTCGAGGTTCTTTAATCC 5' (SEQ ID NO:36)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl
II.

Thermococcus chitonophagus GC74 - 22G

5' CCGAGAATTCATTCATTAAAGAGGAGAAATTAACTATGCTTCCAGGAGAACTTTCTC 3'
(SEQ ID NO:37)

3' CGGAGGATCCCTACCCCTCCTCTAAGATCTC 5' (SEQ ID NO:38)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
BamHI.

M11TL

5' AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG 3' (SEQ ID NO:39)
3' AATAAAAGCTTACTGGATCAGTGTAAGATGCT 5' (SEQ ID NO:40)

Vector: pQE70; and contains the following restriction enzyme sites 5' SphI and 3' Hind
III.

Thermotoga maritima MSB8 - 6G

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGGAAAGGATCGATGAAATT 3' (SEQ ID NO:41)
3' CGGAGGTACCTCATGGTTTGAATCTCTTCTC 5' (SEQ ID NO:42)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 

Pyrococcus furiosus VC1 - 7G1

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGTTCCCTGAAAAGTTCCTT 3' (SEQ ID NO:43)
3' CGGAGGTACCTCATCCCCTCAGCAATTCCTC 5' (SEQ ID NO:44)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Kpn
I.

Bankia gouldi endoglucanase (37GP1)

5' AATAAGGATCCGTTTAGCGACGCTCGC 3' (SEQ ID NO:45)

3' AATAAAAGCTTCCGGGTTGTACAGCGGTAATAGGC 5' (SEQ ID NO:46)

Vector: pQE52; and contains the following restriction enzyme sites 5' Bam HI and 3'
Hind III.

Thermotoga maritima α-galactosidase (6GC2)

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGATCTGTGTGGAAATATTCGGAAAG 3'
(SEQ ID NO:47)

3' TCTATAAAGCTTTCATTCTCTCTCACCCTCTTCGTAGAAG 5' (SEQ ID NO:48)

Vector: pQET; and contains the following restriction enzyme sites 5' EcoRI and 3' Hind
III.



Thermotoga maritima β-mannanase (6GP2)

5' TTTATTCAATTGATTAAAGAGGAGAAATTAACTATGGGGATTGGTGGCGACGAC 3'
(SEQ ID NO:49)

3- -nTATrAAGCTTATCmTCATATTCACATACCTCC 5' (SEQ ID NO:50) 

Vector: pQET; and contains the following restriction enzyme sites 5' Hind III and 3'
EcoRI.



AEPII1a β-mannanase (63GB1)

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGAGTTCCTATGGGGC 3'
(SEQ ID NO:51)

3' TTTATTAAGCTTCTCATCAACGGCTATGGTCTTCATTTTC 5' (SEQ ID NO:52)

Vector: pQET; and contains the following restriction enzyme sites 5' Hind III and 3'
EcoRI.



OC1/4V endoglucanase (33GP1)

5' AAAAAACAATTGAATTCATTAAAGAGGAGAAATTAACTATGGTAGAAAGACACTTCAGATATGTTCTT 3'
(SEQ ID NO:53)

3' TTTTTCGGATCCAATTCTTTCATTTACTCTTTTGCCTG 5' (SEQ ID NO:54)

Vector: pQET; and contains the following restriction enzyme sites 5' BamHI and 3'
EcoRI.

Thermotoga maritima pullulanase (6GP3)

5' TTTTGGAATTCATTAAAGAGGAGAAATTAACTATGGAACTGATCATAGAAGGTTAC 3'
(SEQ ID NO:55)

3' ATAAGAAGCTTTTCACTCTCTGTACAGAACGTACGC 5' (SEQ ID NO:56)

Vector: pQET; and contains the following restriction enzyme sites 5' EcoRI and 3' Hind
III.

The restriction enzyme sites indicated correspond to the restriction enzyme sites on
the bacterial expression vector indicated for the respective gene (Qiagen, Inc. Chatsworth,
CA). The pQE vector encodes antibiotic resistance (Amp^r), a bacterial origin of replication
(ori), an IPTG-regulatable promoter operator (P/O), a ribosome binding site (RBS), a 6-His
tag and restriction enzyme sites.
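
The correspondence between a primer and the restriction sites named on its vector line can be checked mechanically. The sketch below is illustrative only: it scans a primer string for the standard recognition sequences of the enzymes named above. `LEADER` is the 34-nucleotide prefix shared by several of the 5' primers as printed (for example the start of SEQ ID NO:35), and `SEQ_ID_39` is the M11TL 5' primer; the dictionary and function names are hypothetical.

```python
# Standard recognition sequences (read 5'->3') for the enzymes named
# on the vector lines of Example 1.
SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}


def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based index of first match} for sites present in primer."""
    seq = primer.upper().replace(" ", "")
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


# Prefix shared by several 5' primers, ending in the ATG start codon.
LEADER = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATG"
# 5' primer listed for M11TL (vector line: 5' SphI).
SEQ_ID_39 = "AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG"

if __name__ == "__main__":
    print("leader:", find_sites(LEADER))
    print("SEQ ID NO:39:", find_sites(SEQ_ID_39))
```

The shared prefix carries an EcoRI site, and SEQ ID NO:39 carries an SphI site, which agrees with the 5' sites given on the corresponding vector lines.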

The pQE vector was digested with the restriction enzymes indicated. The amplified
sequences were ligated into the respective pQE vector and inserted in frame with the
sequence encoding for the RBS. The ligation mixture was then used to transform the E. coli
strain M15/pREP4 (Qiagen, Inc.) by electroporation. M15/pREP4 contains multiple copies
of the plasmid pREP4, which expresses the lacI repressor and also confers kanamycin
resistance (Kan^r). Transformants were identified by their ability to grow on LB plates and
ampicillin/kanamycin resistant colonies were selected. Plasmid DNA was isolated and
confirmed by restriction analysis. Clones containing the desired constructs were grown
overnight (O/N) in liquid culture in LB media supplemented with both Amp (100 µg/ml)
and Kan (25 µg/ml). The O/N culture was used to inoculate a large culture at a ratio of
1:100 to 1:250. The cells were grown to an optical density 600 (O.D.600) of between 0.4 and
0.6. IPTG ("Isopropyl-β-D-thiogalactopyranoside") was then added to a final
concentration of 1 mM. IPTG induces by inactivating the lacI repressor, clearing the P/O
leading to increased gene expression. Cells were grown an extra 3 to 4 hours. Cells were
then harvested by centrifugation.
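
The inoculation ratio and the IPTG addition reduce to simple volume arithmetic. The following sketch assumes a 1 liter expression culture and a 1 M IPTG stock; both values are illustrative and are not given in the text. The small volume added by the stock itself is ignored.

```python
def inoculum_ml(culture_ml: float, ratio: float) -> float:
    """Volume of overnight culture for a 1:ratio inoculation."""
    return culture_ml / ratio


def stock_volume_ml(final_mM: float, culture_ml: float, stock_mM: float) -> float:
    """Stock volume from C1*V1 = C2*V2 (added volume neglected)."""
    return final_mM * culture_ml / stock_mM


if __name__ == "__main__":
    culture = 1000.0       # ml, assumed culture size
    iptg_stock = 1000.0    # mM (1 M), assumed stock concentration
    print("O/N culture at 1:100:", inoculum_ml(culture, 100), "ml")
    print("O/N culture at 1:250:", inoculum_ml(culture, 250), "ml")
    print("IPTG stock for 1 mM final:", stock_volume_ml(1.0, culture, iptg_stock), "ml")
```

For the assumed 1 liter culture, the stated range of 1:100 to 1:250 corresponds to between 10 ml and 4 ml of overnight culture.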

The primer sequences set out above may also be employed to isolate the target gene
from the deposited material by hybridization techniques described above.

Example 2 

Isolation of a Selected Clone from the Deposited Genomic Clones

A clone is isolated directly by screening the deposited material using the
oligonucleotide primers set forth in Example 1 for the particular gene desired to be
isolated. The specific oligonucleotides are synthesized using an Applied Biosystems
DNA synthesizer. The oligonucleotides are labeled with 32P-γ-ATP using T4
polynucleotide kinase and purified according to a standard protocol (Maniatis et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY,
1982). The deposited clones in the pBluescript vectors may be employed to transform
bacterial hosts which are then plated on 1.5% agar plates to the density of 20,000-
50,000 pfu/150 mm plate. These plates are screened using Nylon membranes according
to the standard screening protocol (Stratagene, 1993). Specifically, the Nylon
membrane with denatured and fixed DNA is prehybridized in 6x SSC, 20 mM
NaH2PO4, 0.4% SDS, 5x Denhardt's, 500 µg/ml denatured, sonicated salmon sperm
DNA; and 6x SSC, 0.1% SDS. After one hour of prehybridization, the membrane is
hybridized with hybridization buffer 6x SSC, 20 mM NaH2PO4, 0.4% SDS, 500 µg/ml
denatured, sonicated salmon sperm DNA with 1x10^6 cpm/ml 32P-probe overnight at
42°C. The membrane is washed at 45-50°C with washing buffer 6x SSC, 0.1% SDS
for 20-30 minutes, dried and exposed to Kodak X-ray film overnight. Positive clones are
isolated and purified by secondary and tertiary screening. The purified clone is
sequenced to verify its identity to the primer sequence.

Once the clone is isolated, the two oligonucleotide primers corresponding to the
gene of interest are used to amplify the gene from the deposited material. A polymerase
chain reaction is carried out in 25 µl of reaction mixture with 0.5 µg of the DNA of the
gene of interest. The reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 µM
each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of Taq
polymerase. Thirty five cycles of PCR (denaturation at 94°C for 1 min; annealing at
55°C for 1 min; elongation at 72°C for 1 min) are performed with the Perkin-Elmer
Cetus automated thermal cycler. The amplified product is analyzed by agarose gel
electrophoresis and the DNA band with expected molecular weight is excised and
purified. The PCR product is verified to be the gene of interest by subcloning and
sequencing the DNA product. The ends of the newly purified genes are nucleotide
sequenced to identify full length sequences. Complete sequencing of full length genes is
then performed by Exonuclease III digestion or primer walking.
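
The cycling program and reaction composition above imply a few derived quantities. The sketch below is a rough illustration: it sums only the programmed hold times (ramping between temperatures is not counted, so real run time would be longer) and converts a micromolar concentration in a microliter volume to picomoles. The function names are illustrative.

```python
def cycling_minutes(cycles: int, step_minutes: tuple) -> float:
    """Total programmed hold time, excluding temperature ramps."""
    return cycles * sum(step_minutes)


def pmol_in_reaction(conc_uM: float, volume_ul: float) -> float:
    """uM x ul = pmol, since 1 uM is 1 pmol per ul."""
    return conc_uM * volume_ul


if __name__ == "__main__":
    # 35 cycles of 1 min at 94 C, 1 min at 55 C, 1 min at 72 C.
    print("hold time:", cycling_minutes(35, (1, 1, 1)), "min")
    # 20 uM of each dNTP in a 25 ul reaction.
    print("each dNTP:", pmol_in_reaction(20, 25), "pmol")
```

With the stated values this gives 105 minutes of hold time and 500 pmol of each dNTP per 25 µl reaction.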

Example 3 
Screening for Galactosidase Activity 

Screening procedures for α-galactosidase protein activity may be assayed for as
follows:

Substrate plates were provided by a standard plating procedure. Dilute XL1-Blue
MRF' E. coli host (Stratagene Cloning Systems, La Jolla, CA) to O.D.600 = 1.0
with NZY media. In 15 ml tubes, inoculate 200 µl diluted host cells with phage. Mix
gently and incubate tubes at 37°C for 15 min. Add approximately 3.5 ml LB top
agarose (0.7%) containing 1 mM IPTG to each tube and pour onto the NZY plate surface.
Allow to cool and incubate at 37°C overnight. The assay plates are obtained as
substrate p-Nitrophenyl α-galactoside (Sigma) (200 mg/100 ml) (100 mM NaCl, 100
mM Potassium-Phosphate) 1% (w/v) agarose. The plaques are overlayed with
nitrocellulose and incubated at 4°C for 30 minutes whereupon the nitrocellulose is
removed and overlayed onto the substrate plates. The substrate plates are then incubated
at 70°C for 20 minutes.

Example 4

Screening of Clones for Mannanase Activity

A solid phase screening assay was utilized as a primary screening method to test
clones for β-mannanase activity.

A culture solution of the Y1090- E. coli host strain (Stratagene Cloning Systems,
La Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified library from
Thermotoga maritima lambda gt11 library was diluted in SM (phage dilution buffer):
5 x 10^[illegible] pfu/µl diluted 1:1000 then 1:100 to 5 x 10^[illegible] pfu/µl. Then 8 µl of phage dilution
(5 x 10^[illegible] pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml
tubes at 37 °C for 15 minutes.
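
The two-step dilution (1:1000, then 1:100) amounts to a 100,000-fold reduction in titer, and the number of plaque forming units plated follows from the diluted titer and the 8 µl plated. The sketch below illustrates that arithmetic; the stock titer used is a placeholder chosen for the example, not the value from this Example, and the function names are hypothetical.

```python
def diluted_titer(stock_pfu_per_ul: float, dilutions: list) -> float:
    """Apply successive 1:N dilutions to a stock titer."""
    titer = stock_pfu_per_ul
    for factor in dilutions:
        titer /= factor
    return titer


def pfu_plated(titer_pfu_per_ul: float, volume_ul: float) -> float:
    """Plaque forming units delivered in the plated volume."""
    return titer_pfu_per_ul * volume_ul


if __name__ == "__main__":
    stock = 1e6  # pfu/ul, placeholder value for illustration only
    working = diluted_titer(stock, [1000, 100])
    print("working titer:", working, "pfu/ul")
    print("plated in 8 ul:", pfu_plated(working, 8), "pfu")
```

Whatever the stock titer, the working titer is the stock divided by 10^5, and the plated count is eight times the working titer per microliter.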

Approximately 4 ml of molten, LB top agarose (0.7%) at approximately 52 °C 
was added to each tube and the mixture was poured onto the surface of LB agar plates. 
The agar plates were then incubated at 37 °C for five hours. The plates were replicated 
and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes (Stratagene 
Cloning Systems, La Jolla, CA) overnight. The nylon membranes and plates were
marked with a needle to keep their orientation and the nylon membranes were then
removed and stored at 4 °C.

An Azo-galactomannan overlay was applied to the LB plates containing the
lambda plaques. The overlay contains 1% agarose, 50 mM potassium-phosphate buffer
pH 7, 0.4% Azocarob-galactomannan (Megazyme, Australia). The plates were
incubated at 72 °C. The Azocarob-galactomannan treated plates were observed after 4
hours then returned to incubation overnight. Putative positives were identified by
clearing zones on the Azocarob-galactomannan plates. Two positive clones were
observed.

The nylon membranes referred to above, which correspond to the positive clones,
were retrieved, oriented over the plate and the portions matching the locations of the
clearing zones for positive clones were cut out. Phage was eluted from the membrane
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution buffer)
and 25 µl CHCl3.

Example 5

Screening of Clones for Mannosidase Activity

A solid phase screening assay was utilized as a primary screening method to test
clones for β-mannosidase activity.

A culture solution of the Y1090- E. coli host strain (Stratagene Cloning Systems,
La Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified library from
AEPII 1a lambda gt11 library was diluted in SM (phage dilution buffer): 5 x 10^[illegible] pfu/µl
diluted 1:1000 then 1:100 to 5 x 10^[illegible] pfu/µl. Then 8 µl of phage dilution
(5 x 10^[illegible] pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml
tubes at 37 °C for 15 minutes.

Approximately 4 ml of molten, LB top agarose (0.7%) at approximately 52 °C
was added to each tube and the mixture was poured onto the surface of LB agar plates.
The agar plates were then incubated at 37 °C for five hours. The plates were replicated
and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes (Stratagene
Cloning Systems, La Jolla, CA) overnight. The nylon membranes and plates were
marked with a needle to keep their orientation and the nylon membranes were then
removed and stored at 4 °C.

A p-nitrophenyl-β-D-manno-pyranoside overlay was applied to the LB plates
containing the lambda plaques. The overlay contains 1% agarose, 50 mM potassium-
phosphate buffer pH 7, 0.4% p-nitrophenyl-β-D-manno-pyranoside (Megazyme,
Australia). The plates were incubated at 72 °C. The p-nitrophenyl-β-D-manno-
pyranoside treated plates were observed after 4 hours then returned to incubation
overnight. Putative positives were identified by clearing zones on the p-nitrophenyl-β-
D-manno-pyranoside plates. Two positive clones were observed.

The nylon membranes referred to above, which correspond to the positive clones,
were retrieved, oriented over the plate and the portions matching the locations of the
clearing zones for positive clones were cut out. Phage was eluted from the membrane
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution buffer)
and 25 µl CHCl3.

Example 6
Screening for Pullulanase Activity

Screening procedures for pullulanase protein activity may be assayed for as
follows:

Substrate plates were provided by a standard plating procedure. Host cells are
diluted to O.D.600 = 1.0 with NZY or appropriate media. In 15 ml tubes, inoculate 200
µl diluted host cells with phage. Mix gently and incubate tubes at 37 °C for 15 min.
Approximately 3.5 ml LB top agarose (0.7%) is added to each tube and the mixture
is plated, allowed to cool, and incubated at 37°C for about 28 hours. Overlays of 4.5
ml of the following substrate are poured:

100 ml total volume

0.5 g Red Pullulan (Megazyme, Australia)

1.0 g Agarose

5 ml Buffer (Tris-HCl pH 7.2 @ 75 °C)

2 ml 5 M NaCl

5 ml CaCl2 (100 mM)

85 ml dH2O

Plates are cooled at room temperature, and then incubated at 75 °C for 2 hours.
Positives are observed as showing substrate degradation.
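
The overlay recipe is given per 100 ml, while each plate receives 4.5 ml. The sketch below scales the listed components to a chosen number of plates and computes the final NaCl and CaCl2 concentrations in the 100 ml mix. It is illustrative only: the dictionary keys, function names, and the example plate count are assumptions, and no allowance is made for pipetting losses.

```python
# Components per 100 ml of overlay, as listed in the recipe.
RECIPE_PER_100_ML = {
    "red_pullulan": (0.5, "g"),
    "agarose": (1.0, "g"),
    "tris_hcl_buffer": (5.0, "ml"),
    "nacl_5M": (2.0, "ml"),
    "cacl2_100mM": (5.0, "ml"),
    "dH2O": (85.0, "ml"),
}


def scale_recipe(n_plates: int, overlay_ml: float = 4.5) -> dict:
    """Scale each component to n_plates overlays of overlay_ml each."""
    factor = n_plates * overlay_ml / 100.0
    return {name: (amount * factor, unit)
            for name, (amount, unit) in RECIPE_PER_100_ML.items()}


def final_mM(stock_mM: float, added_ml: float, total_ml: float = 100.0) -> float:
    """Final concentration after diluting a stock into the total volume."""
    return stock_mM * added_ml / total_ml


if __name__ == "__main__":
    for name, (amount, unit) in scale_recipe(20).items():
        print(f"{name}: {amount:.2f} {unit}")
    print("final NaCl:", final_mM(5000, 2), "mM")
    print("final CaCl2:", final_mM(100, 5), "mM")
```

For the listed recipe, the final concentrations work out to 100 mM NaCl and 5 mM CaCl2.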

Example 7 
Screening for Endoglucanase Activity
Screening procedures for endoglucanase protein activity may be assayed for as
follows:

1. The gene library is plated onto 6 LB/GelRite/0.1% CMC/NZY agar plates
(~4,800 plaque forming units/plate) in E. coli host with LB agarose as top agarose. The
plates are incubated at 37°C overnight.

2. Plates are chilled at 4°C for one hour.

3. The plates are overlayed with Duralon membranes (Stratagene) at room
temperature for one hour and the membranes are oriented and lifted off the plates and
stored at 4°C.

4. The top agarose layer is removed and plates are incubated at 37°C for ~3
hours.

5. The plate surface is rinsed with NaCl.

6. The plate is stained with 0.1% Congo Red for 15 minutes.

7. The plate is destained with 1 M NaCl.

8. The putative positives identified on plate are isolated from the Duralon
membrane (positives are identified by clearing zones around clones). The phage is
eluted from the membrane by incubating in 500 µl SM + 25 µl CHCl3.

9. Insert DNA is subcloned into any appropriate cloning vector and
subclones are reassayed for CMCase activity using the following protocol:

i) Spin 1 ml overnight miniprep of clone at maximum speed for 3
minutes.

ii) Decant the supernatant and use it to fill "wells" that have been
made in an LB/GelRite/0.1% CMC plate.

iii) Incubate at 37 °C for 2 hours.

iv) Stain with 0.1% Congo Red for 15 minutes.

v) Destain with 1 M NaCl for 15 minutes.

vi) Identify positives by clearing zone around clone.

Numerous modifications and variations of the present invention are possible in 
light of the above teachings and, therefore, within the scope of the appended claims, the 
invention may be practiced otherwise than as particularly described. 

WHAT IS CLAIMED IS:

1. An isolated polynucleotide selected from the group consisting of:

(a) SEQ ID NOS: 1-14 and 57-60;

(b) SEQ ID NOS: 1-14 and 57-60, wherein T can also be U;

(c) polynucleotide sequences complementary to SEQ ID NOS: 1-14 and 57-
60;

(d) polynucleotide sequences which encode an amino acid sequence as set
forth in SEQ ID NOS: 15-28, and 61-64; and

(e) fragments of (a), (b), (c) or (d) that are at least 15 consecutive bases in
length and that will selectively hybridize to DNA which encodes a
polypeptide of SEQ ID NOS: 15-28, and 61-64.

2. A vector comprising a polynucleotide of claim 1.

3. A host cell containing the vector of claim 2.

4. The method of claim 3, wherein the host cell is a eukaryotic cell.

5. The method of claim 3, wherein the host cell is a prokaryotic cell.

6. A method for producing a polypeptide comprising:

(a) culturing the host cells of claim 3;

(b) expressing from the host cell of claim 3 a polypeptide encoded by said
polynucleotide; and

(c) isolating the polypeptide.

7. An enzyme selected from the group consisting of:

(a) an enzyme comprising an amino acid sequence set forth in SEQ ID NOS:
15-28 or 61-64; and

(b) an enzyme which comprises at least 30 consecutive amino acid residues of
an enzyme of (a).

8. An enzyme of which at least a portion is coded for by a polynucleotide of
claim 1, and which is selected from the group consisting of:

(a) an enzyme comprising an amino acid sequence which is at least 70%
identical to an amino acid sequence selected from the group of amino
acid sequences set forth in SEQ ID NOS: 15-28 or 61-64; and

(b) an enzyme which comprises at least 30 amino acid residues to the
enzyme of (a).

9. A method for generating glucose from soluble cell oligosaccharides comprising
contacting a sample containing oligosaccharides with an effective amount of an
enzyme selected from the group consisting of an enzyme having the amino acid
sequence set forth in SEQ ID NOS: 15-28, 61-63 and 64 such that glucose is
produced.

10. The method of claim 9, wherein the sample is selected from the group consisting
of dairy products, fruit juices, detergents, textiles, guar gum, animal feed, plant
biomass and waste products.

11. The method of claim 9, wherein the oligosaccharide is selected from the group
consisting of maltose, cellobiose, lactose, sucrose, raffinose, stachyose,
verbascose, cellulose, starch, amylose, glycogen, disaccharides, polysaccharides
and pullulan.

M11TL GLYCOSIDASE - 29G
COMPLETE GENE SEQUENCE - 9/95

[Sequence rows for nucleotides 1-180 are not legible in the source.]

Jei TTA AAC CAA AAT CAC CAC CAC CTt: fXT CAC AAC CTG CCC CTT AAC ACT ATT ACA CPA .XX" 240 

61 Asn On A5n Asp His A.sp l.c». AM cMu Lys Leu Cly Val Asn Thr lie An, Vol CJy 80 

'fl! """^ '''^ ^^''^ '^'^ '''•'^ ACT TTC AAT CTT AAA Cn^ CCT CTA GAG ACA 300 

81 Val Clu Trp ser Arg lie Phe Pro Lys Pro Thr Phe Asn Val Lys v*l Pro v«l Clu Arg lol 

301 CAT CAC AAC CCC ACC ATT CTT CAC CTA CAT CTC CAT CAT AAA CCC CTT CAA ACA CTT CAT 

101 ASP Clu Asn Gly Ser He Val His Val Asp Val Asp Asp Lys Ala Val Clu Aro Leu Asp 120 

361 CAA TTA CCC AAC AAC CAC CCC CTA AAC CAT TAC GTA GAA ATG TAT AAA CAC TCC CTT GAA 420 

121 Clu Leu Ala Asn Lys Clu Ala Val Asn His Tyr Val clu MeC Tyr Lys Asp Tzp Val Clu ' 140 

421 ACA CCT ACA AAA CTT ATA CTC AAT TTA TAC CAT TCG CCC CTC CCT CTC TCG CTT CAC AAC 
14 1 Ara Cly Arg Lys Leu lie Leu Asn Leu Tyr His Trp Pro Leu Pro Leu Trp Leu His Asn 

4BI CCA ATC ATC CTC AGA ACA ATG CCC CCC i;AC ACA CCC CCC TCA CCC TCG CTT AAC GAG GAG 540 

161 Pro He Met Val Arg Arg Mec Cly Pro Asp Arg Ala Pro Ser Cly Trp Leu Asn Clu Clu 180 

541 TCC CTC CTC CAC TTT GCC AAA TAC CCC CCA TAC ATT CCT TCC AAA ATC CCC GAG CTA CCT 

181 Ser Val Val Clu Phe Ala Lys Tyr Ala Ala Tyr He Ala Trp Lys Mec Cly clu Leu Pro 

601 CTT ATC TCG ACC ACC ATG AAC GAA CCC AAC CTC CTT TAT GAG CAA CGA TAC ATG TTC GTT 

201 Val Met Trp Ser Thr Met Asn Clu Pro Asn Val Val Tyr Clu Cln Cly Tyr Mec Phe Val 220 

661 AAA CCC GCT TTC CCA CCC CCC TAC TTC ACT TTC CAA CCT CCT CAT AAC GCC ACC AGA AAT 720 

221 Lys Gly Cly Phe Pro Pro Cly Tyr Leu Ser Leu Clu Ala Ala Asp Lys Ala Arg Arg Asn 240 

721 ATC ATC CAC CCT CAT CCA CCC CCC TAT GAC AAT ATT AAA CCC TTC ACT AAC AAA CCT CTT 780 

241 Mec He Cln Ala His Ala Arg Ala Tyr Asp Asn He Lys Arg Phe Ser Lys Lys Pro Val 2 60 

781 CGA CTA ATA TAC CCT TTC CAA TCC TTC GAA CTA TTA GAC CCT CCA GCA GAA GTA TTT CAT 840 

261 Cly Leu He Tyr Ala Phe Gin Trp Phe Clu Leu Leu Clu Cly Pro Ala Clu Val Phe Asp 280 

841 AAC TTT AAC ACC TCT AAC TTA TAC TAT TTC ACA CAC ATA CTA TCC AAC CCT ACT TCA ATC 900 

281 Lys Phe Lys Ser Ser Lys Leu Tyr Tyr Phe Thr Asp He Val Ser Lys Gly Ser Ser He 300 

901 ATC AAT CTT CAA TAC ACC ACA GAT CTT GCC AAT ACC CTA GAC TCG TTC GGC CTT AAC TAC 960 

301 He Asn Val Clu Tyr Arg Arg Asp Leu Ala Asn Arg Leu Asp Trp Leu Cly Val Asn Tyr 32 0 

961 TAT ACC CCT TTA CTC TAC AAA ATC CTC CAT CAC AAA CCT ATA ATC CTC CAC CCC TAT CCA 1020 

321 Tyr Ser Arg Leu Val Tyr Lys He Val Asp Asp Lys Pro He He Leu His Gly Tyr Cly 340 

1021 TTC CTT TCT ACA CCT GGC CCC ATC ACC CCC GCT CAA AAT CCT TCT ACC CAT TTT CCC TCG 1080 

341 Phe Leu Cys Thr Pro Gly Gly He Ser Pro Ala Glu Asn Pro Cys Ser Asp Phe Cly Trp 360 

1081 GAG CTC TAT CCT CAA CCA CTC TAC CTA CTT CTA AAA CAA CTT TAC AAC CCA TAC CCC CTA 1140 

361 Clu Val Tyr Pro Clu Cly Leu Tyr Leu Leu Leu Lys Clu l^u Tyr Asn Arg Tyr Cly Val 380 

114 1 CAC TPC ATC CTC AlV GAG AAC CCT CTT TCA CAC ACC A<n; UAT CCC TTC ACA CCC CCA TAC 1200 

381 Asp Leu He Va > Thr Glw Asn Gly Val Ser Asp Ser Ara Asp Ala Leu Arg Pro Ala Tyr 40O 



[Sequence rows from nucleotide 1201 onward are not legible in the source.]

Figure 1a.

OC1/4 GLYCOSIDASE - 33G/B
COMPLETE GENE SEQUENCE - 9/95

[Sequence rows preceding nucleotide 661 are not legible in the source.]
661 AAC CTT CTG ATC AAA ATA GAA CCC CCC CAT CCA AAA CCC CAA ACT TTC T-G err nrx 
271 A.n Val Vel M.C Lys XI. Clu P.o Cly Asp Al. Lys Pro ^ ^ 

241 V.1 X.P Ly. Ph. Val Xsn AI. Trp s« Hi. Asp Pro Val ^ ^ ^ ^ ^ J ^ 

u Clu Al. val Al* L.U Tyr Thr Clu Ly$ Cly L«u Cln Val L«u Asp Ser Asp Het Asn 

28! lH ill *^ ^ ^ TAC ACA AGA ACA err GTT err 

He II* ser Thr Pro II. Asp Ph. Ph. Cly Val Asn Tyr Tyr Thr Arg Thr Leu Val Val 

301 IT ^""^ ^ ^ ^ «^ GAC CTT ACC GAG 

301 Ph« A.p H.t Asn Asn Pro L.u Cly Ph. S.r Tyr Val Cl„ Cly Asp ^ p^ l^t 

III mI^ ^ ''^'^ ^ ^^'^ CCA TTT CAT ATC CTC CTC TAT CTC AAC GAA AGA 

321 Met Cly Trp Clu II. Tyr Pro Gin Cly Lou Ph. Asp H.t Leu Val ^ ^ ^ 

1021 TAT AAA CTA CCA CTT TAT ATC ACA GAC AAC CCC ATC COT CCA CCT CAT axa ™- n^. ..^ 
Tyr .ys Leu Tyr Xl. Thr Clu Asn Cly Hec A^ ^ ^ ^ ^ 

'35! ^ ^ Zl ^^"^ ATT CAA TAT TTC GAA AAC CAC TTT CAA AAA CCA CTT 

ly Ary Val Hxs Asp Asn Tyr Ary II. clu Tyr Leu Clu Lys His Phe Clu Lys Ala Leu 

^ ^a m A^ f ^ <=^'^ ^ AAA COT TAC TTC ATT TGG TCT TTC A^ CAT AAC 

Clu Ala a. Asn Ala Asp Val Asp Leu Lys Cly Tyr Phe He Trp Ser Leu Mec Asp Asn . 

' - 1 ^ ^ ^ AAA OCT TTC CCT ATA ATC TAC CTA CAT TAC AAT ACC 

i" Trp Ala t ys Cly Tyr Ser Lys Ary Phe Cly He He Tyr Val Asp Tyr Asn Thr 

Pro t^s r T ^ ^ AAC CAA TTT CTA AAA TCr TAA 1317 

Arg , x.ru Lys Asp Ser Ala H.c Trp Leu Lys c;lu rh« Leu Lys Ser F.,d 4*0 



Figure 2

[Sequence figure: nucleotide and deduced amino acid sequence rows are not legible in the source.]

Complete gene sequence - 9/95

[Sequence figure: nucleotide and deduced amino acid sequence rows are not legible in the source.]

Figure 4a

Figure 4 (Continued)

I ATC GAA ACi; ATC GAT GAA ATT CTC TrT CAG TTa ACT ACA GAG GA/
I Met G»o Atft l)r Asp Cl» lie Utt Scr Cin l.co Thr Thr Clu Glu 

61 (JTC ccc crrr cgt ctt cca cga ctt ttt ccg aac cca cat tcc aca 

21 V*| Cly Val t;iy Ixu Pm Gly Lxu Phc Gly Aaii Pm His Scr Ar|t 

121 CCA CAA ACA CAT CCC CTT CCA ACA CTT GCA ATT CCT GCC TTT CTC 
4 1 C»y Glu Thr lies Pro V»| prii Arg Let* Gly Ik Pr« Ala phc Val 

IS I GCA CCA CTC ACA ATA AAT CCC ACA ACG CAA AAC GAT GAA AAC ACT 
61 Ata Oy Uu Aff lie Am Pro TTir Art Cio Am Asp Ctu Asn Thr 

241 TTT CCC CTT CAA ATC ATC CTC CCT TCT ACC TCC AAC ACA CAC CTT 
81 Phe Pro V»| Glo lie Mcl Utt Alt Scr Thf Trp Ami Arj Ajp Leu 

301 AAA CCC ATC CCA CAA CAA CTT AGO CAA TAC CCT CTC CAT CTO CTT 
101 Ly« AU Mci Gly Glu Glu Val Art Gin Tyr Cly Vil Aip V*| Lx« 

361 AAC ATT CAC ACA AAC CCT CTT TCT CGA ACC AAT TTC CAG TAC TAC 
121 Ain lie His Arj A$n Pro Utt Cys Ciy Arj Asn Phe Glu Tyr Tyr 

421 CTT TCC CCT CAA ATC CCT TCA CCC TTT GTC AAC CGA CTT CAA TCT 
Ui Utt Scr Cly Clu Met AU Scr AU Phe V»| Uyj Cly V»I Gin Scr 

4«I TCC ATA AAA CAC TTT GTC CCG AAC AAC CAG GAA ACC AAC ACC ATG 
161 Cy$ Jle Ly« His Phc V»i AU Asn Am Chi Clu Thr Asn Arj Met 

541 GTC TCC CAC CCA GCC CTC ACA CAA ATA TAT CTC AAA CCT TTT GAA 
Kl Vtl Set Clu Arj AU Uo Arg G)» tic Tyr Uu Lyi Cly Phc CId 

601 CCA ACA CCC TCG ACC GTC ATC ACC CCT TAC AAC AAA CTG AAT CGA 
201 AU Arj Pro Trp Thr Val Met Scr AU Tyr Am Ly* Uu Asn Cij 

661 AAC CAA TCG CTT TTC AAG AAC CTT CTC ACG CAA CAA TCG CCA TTT 
221 Ain Clu Trp Uu Uu Lys Lys V*| Uo Arj Clu Glu Trp Gly Phc 

721 ACC CAC TCG TAC CCG CGA CAC AAC CCT CTA GAA CAC CTC AAC CCC 
24 1 Scr Asp Trp Tyr AU Cly Asp A»o Pro Val Clu Gin Uo Lys Ate 

781 ATC CCT CCG AAA GCC TAT CAC CTC AAC ACA CAA ACA ACA CAT CAA 
261 Met Pro Cly Lys AU Tyr Gin Val Asn Thr Clu Arg Arg Asp Clu 

841 CAC GCC TTC AAC GAG CGA AAA TTC ACT GAG CAC CTT CTC CAT CAC 
2SI Clu AU U« Lys Clu Ciy Lys Lcti Ser Clu Clu Val Lea Asp Clu 

901 CTC AAA CTT CTT CTC AAC CCC CCT TCC TTC AAA CCG TAC ACC TAC 
301 Uu Lys V.I Uu V.I A-in AU Pro Scr Phe Lys Cly Tyr Arg Tyr 

961 CTC CAA TCT CAC CCC CAA CTC CCC TAC CAA CCA GCT GCC CAC CCT 
J^l Leu Clu Scr His AU Glu Val AU Tyr Clu AU Gly AU Clu Cly 

»02l AAC AAC CCT CTT CTT CCC TTC CAT CA- 
->-»l Asn Aw Gly Vj| Uu Pr« Phc Axp Clu 

ATC CAA ACA ATA AAC GCA CGA ACC CC> 
361 lie Clu Thr lie lys <;i, Cly Thr Cly 

[Sequence figure (continued): nucleotide and deduced amino acid sequence rows are not legible in the source.]

OC1/4V Endoglucanase (33GP1)

[Sequence figure: nucleotide and deduced amino acid sequence rows are not legible in the source.]

Figure 13

(continued)
Figure 14 (6GP3)

[Nucleotide and deduced amino acid sequence rows not recoverable from the scanned source]
Figure 15. Thermotoga maritima MSB8 (Clone # 6GP2) Glycosidase



1 



CTT TTA TTG ATC GTT GAG CTC TCT TTC GTT CTC TTT GCA Z.rx n 

^„ J - - - 0« „C 

ACT GTT CTG GAG AGT GCC AGA GAC ATG GGT ATA AAG GTC CTC AGA ATC TGG 
Ser val .eu CIu Ser Ala A., Asp Met Gly xie .y. VaX .eu A^^ xL Z 

Oly Phe .eu ASP Gly Glu Se. Tyr Cys Arg Asp X.ys Asn Th. Tyr Met Zs 

CCT GAG CCC OCT GTT TTC GGG GTC- CCA OA.. CGA ATA TCG AAC GCC CAG AGC 
Pro Glu P.O Gly val P„e Gly Val P.o Clu Gly Xle Ser Asn Ala G^ sL 

_ ly Glu Ar g_J^sp„Tyr-Thr-Val-Al.a-I.ys-Ala-I,y3-Glu- L.u Gly Xl e" 

^ ^ vli r """^ "'^^ ^"'^ ^ -'^^ -° 

ys Leu val Xle Val val Asn Asn Trp Asp Asp Pbe Gly cly Met Asn 

Tyr val Arg Trp Phe Gly Gly Thr His His Asp Asp Phe Tyr Arg Asp 

GAG AAG ATC AAA CAA GAG TAC AAA AAG TAC GTC TCC TTT CTC GTA AAC CAT 
Clu .ys Xle .ys Clu Glu Tyr .ys .ys Tyr Val Ser Phe .eu Val Asn 

"l Z Thr r '^^^ -C 

Thr Tyr Thr Gly Val Pro ^r Arg Clu Glu Pro Thr He Met Ala 

TGG GAG CTT GCA AAC GAA CCG CGC TGT GAG ACG GAC AAA TCG GGG .nr- 
Leu Ala Asn Glu Pro Arg Cys Glu Thr Asp .ys Ser Gly Asn Thr 
TTC AAA CCT TAC GGT GGA CAA CCC GAG TGG GCC TAC AAC GGC TGG TCC GGT
Phe .,3 P.O ryr G.y Gl, Glu Ala Glu A.a X.. Asn 31, rZ sL 

OTT GAC TGG AAG AAG CTC CTT TCG ATA GAG ACG GTG GAC TTC GGC ACG TTC 
Val ASP T^ .ys .y. ... ^er lie Glu T.. val Asp P.e Gly t" Zl 

CAC CTC TAT CCG TCC CAC TGG GGT GTC AGT CCA GAG AAC TAT GCC CAG TGG 
H.S .eu T^ Se. His T^ Gl, v.l Se. P.o Glu Asn Ala G^. ::p 

GGA GCG AAG TGG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC GGA AAA 
01>^ Ala Trp lie Glu Asp His lie .ys lie Ala .ys Glu lie Gly ^ 

Z IZ Z Z T T ^ °- 

val Leu Glu Glu Tyr Gly He ,ro Lys Ser Ala Pro Val Asn Arg 

x'hr Ala T -A GAT 

Ala lie ryr Arg .eu T^ Asn Asp .eu Val Tyr Asp l,eu Gly Gly Asp 

"y Ai! zi z r r 

ly Met Phe Trp Met Leu Ala Gly He Gly Glu Gly Ser Asp Arg Asp 

Glu r r '^'^ ^^'^ ^-^^ AAC GAC GAC 

Ol. Arg Gly T^r Tyr Pro Asp Tyr Asp Gly P.e Arg Xle Val Asn Asp "p 

s" pr" r T r ""'^ ^ 

^er Pro Glu Ala Glu Leu He Arg Glu Tvr Ala Tx,= r 

y V.4.U lyr Ala Lys Leu Phe Asn Thr Gly 

GAA GAC ATA AGA GAA GAC ACC TGC TCT TTP A-rn 

Glu Asp He Ar« . ^ "^"^ """^ ^ °AC GGC ATG 

P lie Arg Glu Asp Thr Cys Ser Phe He Leu Pro Lys Asp Gly Met 

OAG ATC AAA AAG ACC GTG GAA GTG AGG GCT GGT GTX TTC GAC TAC AGC AAC 

Figure 15b (continued) 
Glu Ile Lys Lys Thr Val Glu Val Arg Ala Gly Val Phe Asp Tyr Ser Asn

ACG TTT GAA AAG TTG TCT GTC AAA GTC r^^ ^ 
Thr Glu .,3 .eu Ser VaX Ts Val g^ a 

ys Glu Asp Leu val Phe Glu Asn Glu 

ATA GAG CAT CTC GGA TAG GGA ATT TAC GGC 

X^e Glu H.S .eu Gl. ryr Gl, xie ^ g" Z T 

■ly Pile Asp Leu Asp Thr Thr Arg 

ATC CCG gat GGA GAA CAT GAA ATG TTC CTT GAA r.. 

- - ^ 0. z z 

ACG GTG AAA GAC TCT ATC AAA GCG AAA GTG r-m 

- v« „^ - - - «c 

^ V- - z z z z z zr— 

Giu Val Lys Asn Trp Trp 

AAC AGC GGA ACC TGG CAG GCA GAG TTC am 

3, ^ - - - - ^ 

«v z - - - - - ^ ^ ^ 

y Ala Leu Gin Leu Asn Val Lys Leu Pro Gly Lys 
- -AGC GAC TGG gTG AGA GTA GCA AGG I 

.rp Clu Glu val AT. val a" L l^: p^L^ ^ ^ 

y i-ys Pfte Glu Arg Leu Ser Glu 

TGT GAG ATC CTC GAG TAC GAC ATC TAC ATT CCA A.n 

- - - - - - - 

r„ Al. L,u AS„ p„ 

<^ ATO AAC AAC CCG AAC GTG <5AA ACT Cc= r.^ 

A„ A.. A., A.. G. z z z z 

GGA AAA GAG TAC AGA AGA TTC CAT GTA ao. . 

--y ... A,, A., ™ ^ z z z r r ^ ^ 

^lu Phe Asp Arg Thr Ala 
Figure 15c (continued)
GGG GTG AAA GAA CTT CAC ATA GGA GTT
Gly Val Lys Glu Leu His Ile Gly Val

GGA CCG ATT TTC ATC GAT AAT GTG AGA
Gly Pro Ile Phe Ile Asp Asn Val Arg

TGA 1991 
END 



GTC GGT GAT CAT CTG AGG TAG GAT 
Val Gly Asp His Leu Arg Tyr Asp 

CTT TAT AAA AGA ACA GGA GGT ATG 
Leu Tyr Lys Arg Thr Gly Gly Met 



Figure 15d (continued)
Figure No. 16a - Thermotoga maritima MSB8 (6GB4)

[Nucleotide and deduced amino acid sequence rows not recoverable from the scanned source]
III r r ^'^'^ TTC GTT XrC CTG TTG AAA GAC TTA AAC GGA CAG ATC TAC AGA GAA GAA 8.0

261 Tyr Leu Tyr Asp Phe Val Phe Val Leu Lys Asp Leu Asn oly Glu He Tyr Arg Glu Glu 280 

HI ^ «^ AGA ATC GTT CAG GAG CCC GAT GAA GAA GGA AAA ACT SOO 

28X Lys Lys He GXy Leu Ar, Arg Val Ar. Xle Val Oln Glu Pro Asp Glu Glu Gly Lys T.r 300 

901 TTC ATA TTC GAA ATC AAC GGT GAG AAA GTC TTC GCT AAG GGT GCT AAd TGG ATT CCC TCA 

301 PHe Xle Phe Glu Xle Asn Gly Glu Lys Val P.e Ala Lys Gly Ala Asn Trp xie Pro sTr 2 

961 GAA AAC ATC CTC ACG TGG TTG AAG GAG GAA GAT TAC GAA AAG CTC GTC AAA ATG GOV AGG 1020 

321 Glu Asn Xle Leu T.r Trp Leu Lys Glu Glu Asp Tyr Glu Lys Leu Val Lys Met Al^ ^ 3! 

1021 AGT GCC AAT ATG AAC ATG CTC AGG GTC TGG GGA GGA GGA ATC TAC GAG AGA GAG ATC TTC 1 
Ser Ala Asn Met Asn Met Leu Arg Val Trp Gly Gly Gly lie Tyr Glu Arg Glu lie Phe 



0 



080 
360 



1081 TAC AGA CTC TGT GAT GAA CTC GGT ATC ATG GTG TGG CAG GAT TTC ATG TAC GCG TGT CTT 1140 

Tyr Arg Leu Cys Asp Glu Leu Gly lie Met Val Trp Gin Asp Phe Met Tyr Ala Cys Leu 380 

1141 GAA TAT COG GAT CAT CTT CCG TCG TTC AGA AAA CTC GCG AAC GAA GAG GCA AGA AAG ATT 1200 
3 81 Glu Tyr Pro Asp His Leu Pro Trp Phe Arg Lys Leu Ala Asn Glu Glu Ala Arg Lys He 



le 400 



1201 GTG AGA AAA CTC AGA TAC CAT CCC TCC ATT GTT CTC TGG XGC G3A A.C ^-..C GAA AAC AAC 1260 
Val Arg Lys Leu Arg Tyr His Pro Ser lie Val Leu Trp Cys Gly Asn Asn Glu Asn Asn 



420 



T ^'^ '^'^ '^'^ °^ '^'^ ^ *TC AAC CTC GGA AAC 1320 

Trp Gly Phe Asp Glu Trp Gly Asn Met Ala Arg Lys Val Asp Gly lie Asn Leu Gly Asn 44 0 

1321 AGG CTC TAC CTC TTC GAT TTT CCT GAG ATT TGT GCC GAA GAA GAC CCG TCC ACT CCC TAT 1380 

Arg Leu Tyr Leu Phe Asp Phe Pro Glu lie Cys Ala Glu Glu Asp Pro Ser Thr Pro Tyr 460 

^l?. '''^ ^""^ '^'^ °^ ^ ^= «^ A^ 1440 
Trp Pro Ser Ser Pro Tyr Gly Gly Glu Lys Ala Asn Ser Glu Lys Glu Gly Asp Arg HU 



is 480 



val Trp Tyr Val Trp Ser Gly Trp Met Asn Tyr Glu Asn Tyr Glu Lys Asp Thr Gly Arg 500 

'loi r r '^'^ ^ «T CAG GGT GCT CCC O.T CCA GAG ACG ATA GAG TTC TTT TCA 1560 

Phe Xle ser Glu Phe Gly Phe Gin Gly Ala Pro His Pro Glu Thr Xle Glu Phe Phe Ser S20 

I-ys Pro Glu Glu Arg Glu Xle Phe His Pro Val Met Leu Lys His Asn Lys Gin Val Glu 540 

Figure 16b (continued)
1«1 GGA CAG GAA AGA TTC ATC AGG TTC ATA TTC GGA ^

9 Pro Lys Ala Leu Tyr Tyr Tyr 620 

1861 GCG AGA AGA TTC TTC GCT f»» 

1981 CGA GAA GAA GGG AGA AAA GGT ATT CGA AAA r^.. 

2041 TGT GAG TTT GGT TGA 2055
 681 Cys Glu Phe Gly End 685

Figure 16c (continued)
Figure No. 17a - Bankia gouldi (37GP4)

  1 ATG AAA AAA AAT CTA CTA ATG TTT AAA AGG CTT ACG TAT CTA CCT TTG TTT TTA ATG CTG  60
  1 Met Lys Lys Asn Leu Leu Met Phe Lys Arg Leu Thr Tyr Leu Pro Leu Phe Leu Met Leu  20

 61 CTC TCA CTA ACT TCA GTA GCT CAA TCT CCT GTA GAA AAA CAT GGC CGT TTA CAA GTT GAC 120
 21 Leu Ser Leu Ser Ser Val Ala Gln Ser Pro Val Glu Lys His Gly Arg Leu Gln Val Asp  40

121 GGA AAC CGC ATT CTT AAT GCG TCT GGA GAA ATT ACG AGC TTA GCT GGT AAC AGC CTC TTT 180
 41 Gly Asn Arg Ile Leu Asn Ala Ser Gly Glu Ile Thr Ser Leu Ala Gly Asn Ser Leu Phe  60

181 TGG AGT AAT GCT GGA GAC ACC TCC GAT TTT TAT AAT GCA GAA ACT GTT GAT TTT TTA GCA 240
 61 Trp Ser Asn Ala Gly Asp Thr Ser Asp Phe Tyr Asn Ala Glu Thr Val Asp Phe Leu Ala  80

241 GAA AAC TGG AAT AGC TCA CTT ATT AGA ATA GCT ATG GGC GTA AAA GAA AAT TGG GAT GGC 300
 81 Glu Asn Trp Asn Ser Ser Leu Ile Arg Ile Ala Met Gly Val Lys Glu Asn Trp Asp Gly 100

301 GGA AAT GGC TAT ATT GAT AGT CCG CAG GAG C?.^ GAA GCT AAA ATT AGA AAA GTT ATT GAT 360
101 Gly Asn Gly Tyr Ile Asp Ser Pro Gln Glu Gln Glu Ala Lys Ile Arg Lys Val Ile Asp 120

361 GCA GCT ATT GCT AAC GGC ATA TAT GTA ATA ATA GAC TGG CAC ACT CAC GAA GCA GAG TTA 420
121 Ala Ala Ile Ala Asn Gly Ile Tyr Val Ile Ile Asp Trp His Thr His Glu Ala Glu Leu 140

421 TAC ACA GAT GAG GCT GTT GAC TTT TTT ACC AGA ATG GCA GAC CTA TAC GGA GAT ACT CCC 480
141 Tyr Thr Asp Glu Ala Val Asp Phe Phe Thr Arg Met Ala Asp Leu Tyr Gly Asp Thr Pro 160

481 AAT GTA ATG TAT GAA ATT TAT AAC GAG CCT ATA TAC CAA AGT TGG CCT GTT ATT AAG AAT 540
161 Asn Val Met Tyr Glu Ile Tyr Asn Glu Pro Ile Tyr Gln Ser Trp Pro Val Ile Lys Asn 180

541 TAT GCA GAG CAA GTA ATT GCT GGT ATA CGT TCT AAA GAC CCA GAT AAT TTA ATA ATT GTA 600
181 Tyr Ala Glu Gln Val Ile Ala Gly Ile Arg Ser Lys Asp Pro Asp Asn Leu Ile Ile Val 200

601 GGT ACT AGC AAT TAT TCT CAG CAA GTT GAT GTA GCA TCA GCA GAC CCA ATA TCT GAT ACT 660
201 Gly Thr Ser Asn Tyr Ser Gln Gln Val Asp Val Ala Ser Ala Asp Pro Ile Ser Asp Thr 220

661 AAT GTG GCA TAT ACT TTA CAT TTT TAT GCA GCA TTT AAC CCG CAT GAT AAC TTA AGA AAT 720
221 Asn Val Ala Tyr Thr Leu His Phe Tyr Ala Ala Phe Asn Pro His Asp Asn Leu Arg Asn 240

721 GTA GCA CAG ACA GCA TTA GAT AAT AAT GTT GCT TTG TTT GTT ACA GAA TGG GGT ACA ATT 780
241 Val Ala Gln Thr Ala Leu Asp Asn Asn Val Ala Leu Phe Val Thr Glu Trp Gly Thr Ile 260
[Nucleotide and deduced amino acid sequence rows not recoverable from the scanned source]

Figure 17b (continued)
1621 ACT ATT ATA AGA AAT TGC GTG TTT TCT GCA GAA GGA ATT TCA GGA GAA AAT AGC TCA GAT 1680
 541 Thr Ile Ile Arg Asn Cys Val Phe Ser Ala Glu Gly Ile Ser Gly Glu Asn Ser Ser Asp  560

1681 GCT TTT ATT GAT TTA AAA GGA GCC TAT GGT TTT GTA TAC AGA AAC ACG TTT AAT GTT GAT 1740
 561 Ala Phe Ile Asp Leu Lys Gly Ala Tyr Gly Phe Val Tyr Arg Asn Thr Phe Asn Val Asp  580

1741 GGT TCT GAA GTA ATA AAT ACT GGA GTA GAC TTT TTA GAT AGA GGT ACA GGA TTT AAT ACA 1800
 581 Gly Ser Glu Val Ile Asn Thr Gly Val Asp Phe Leu Asp Arg Gly Thr Gly Phe Asn Thr  600

1801 GGT TTT AGA AAT GCA ATA TTT GAA AAT ACA TAT AAC CTT GGC AGT AGA GCT TCA GAA ATT 1860
 601 Gly Phe Arg Asn Ala Ile Phe Glu Asn Thr Tyr Asn Leu Gly Ser Arg Ala Ser Glu Ile  620

1861 TCA ACT GCT CGT AAA AAA CAA GGT TCT CCT GAA CAA ACT CAC GTT TGG GAT AAT ATT AGA 1920
 621 Ser Thr Ala Arg Lys Lys Gln Gly Ser Pro Glu Gln Thr His Val Trp Asp Asn Ile Arg  640

1921 AAC CCT AAT TCT GTT GAT TTT CCA ATA AGT GAT GGT ACA GAA AAT CTA GTA AAT AAA TTC 1980
 641 Asn Pro Asn Ser Val Asp Phe Pro Ile Ser Asp Gly Thr Glu Asn Leu Val Asn Lys Phe  660

1981 TGC CCA GAT TGG AAT ATA GAA CCA TGT AAT CCT GTA GAC GAA ACC AAC CAA GCA CCT ACA 2040
 661 Cys Pro Asp Trp Asn Ile Glu Pro Cys Asn Pro Val Asp Glu Thr Asn Gln Ala Pro Thr  680

2041 ATA AGC TTC CTA TCT CCT GTT AAC AAT ATT ACT TTA GTT GAA GGT TAT AAT TTA CAA GTT 2100
 681 Ile Ser Phe Leu Ser Pro Val Asn Asn Ile Thr Leu Val Glu Gly Tyr Asn Leu Gln Val  700

2101 GAA GTT AAT GCT ACT GAT GCA GAT GGA ACT ATT GAT AAT GTA AAA CTT TAT ATA GAT AAC 2160
 701 Glu Val Asn Ala Thr Asp Ala Asp Gly Thr Ile Asp Asn Val Lys Leu Tyr Ile Asp Asn  720

2161 AAT TTA GTT AGG CAA ATA AAT TCT ACT TCA TAT AAA TGG GGC CAT TCT GAT TCT CCA AAT 2220
 721 Asn Leu Val Arg Gln Ile Asn Ser Thr Ser Tyr Lys Trp Gly His Ser Asp Ser Pro Asn  740

2221 ACA GAT GAA CTT AAT GGT CTT ACA GAA GGA ACT TAT ACC TTA AAA GCA ATT GCA ACT GAT 2280
 741 Thr Asp Glu Leu Asn Gly Leu Thr Glu Gly Thr Tyr Thr Leu Lys Ala Ile Ala Thr Asp  760

2281 AAC GAC GGG GCT TCT ACA GAA ACG CAA TTT ACG TTA ACT GTA ATA ACA GAA CAA AGT CCG 2340
 761 Asn Asp Gly Ala Ser Thr Glu Thr Gln Phe Thr Leu Thr Val Ile Thr Glu Gln Ser Pro  780

2341 TCT GAG AAT TGT GAC TTT AAT ACA CCT TCT TCA ACT GGT TTA GAA GAT TTT GAC ATT AAA 2400
 781 Ser Glu Asn Cys Asp Phe Asn Thr Pro Ser Ser Thr Gly Leu Glu Asp Phe Asp Ile Lys  800

2401 AAG TTT TCT AAC GTT TTT GAG TTA GGA TCT GGC GGA CCA TCT TTA AGT AAT TTA AAA ACA 2460

Figure 17c (continued)
[Nucleotide and deduced amino acid sequence rows not recoverable from the scanned source]

Figure 17d (continued)
Figure No. 18a - Pyrococcus furiosus VC1 (7EG1)

Leader sequence: amino acids 1-24

9 18 27 36 45

ATG AGC AAG AAA AAG TTC GTC ATC GTA TCT ATC TTA ACA ATC CTT TTA GTA CAG
Met Ser Lys Lys Lys Phe Val Ile Val Ser Ile Leu Thr Ile Leu Leu Val Gln

63 72 81 90 99

GCA ATA TAT TTT GTA GAA AAG TAT CAT ACC TCT GAG GAC AAG TCA ACT TCA AAT
Ala Ile Tyr Phe Val Glu Lys Tyr His Thr Ser Glu Asp Lys Ser Thr Ser Asn

117 126 135 144 153 162

ACC TCA TCT ACA CCA CCC CAA ACA ACA CTT TCC ACT ACC AAG GTT CTC AAG ATT
Thr Ser Ser Thr Pro Pro Gln Thr Thr Leu Ser Thr Thr Lys Val Leu Lys Ile

171 180 189 198 207 216

AGA TAC CCT GAT GAC GGT GAG TGG CCA GGA GCT CCT ATT GAT AAG GAT GGT GAT
Arg Tyr Pro Asp Asp Gly Glu Trp Pro Gly Ala Pro Ile Asp Lys Asp Gly Asp

225 234 243 252 261 270

GGG AAC CCA GAA TTC TAC ATT GAA ATA AAC CTA TGG AAC ATT CTT AAT GCT ACT
Gly Asn Pro Glu Phe Tyr Ile Glu Ile Asn Leu Trp Asn Ile Leu Asn Ala Thr

288 297 306 315 324

GGA TTT GCT GAG ATG ACG TAC AAT TTA ACC AGC GGC GTC CTT CAC TAC GTC CAA
Gly Phe Ala Glu Met Thr Tyr Asn Leu Thr Ser Gly Val Leu His Tyr Val Gln

342 351 360 369 378

CAA CTT GAC AAC ATT GTC TTG AGG GAT AGA AGT AAT TGG GTG CAT GGA TAC CCC
Gln Leu Asp Asn Ile Val Leu Arg Asp Arg Ser Asn Trp Val His Gly Tyr Pro

387 396 405 414 423 432

GAA ATA TTC TAT GGA AAC AAG CCA TGG AAT GCA AAC TAC GCA ACT GAT GGC CCA
Glu Ile Phe Tyr Gly Asn Lys Pro Trp Asn Ala Asn Tyr Ala Thr Asp Gly Pro

450 459 468 477 486

ATA CCA TTA CCC AGT AAA GTT TCA AAC CTA ACA GAC TTC TAT CTA ACA ATC TCC
Ile Pro Leu Pro Ser Lys Val Ser Asn Leu Thr Asp Phe Tyr Leu Thr Ile Ser
TAT AAA CTT GAG CCC AAG AAC GGC CTG CCA ATT A.o ^

- - - c. ^ - - - - - ™ 



553 567 



567 e7£: 

TTA ACG AGA GAA GOT TGG AGA ACA ACA GG» »^ 

Thr A.3 Olu Ala Trp ^ T CAA GAA GTA 

g Thr Thr Gly He ^sn Ser Asp Glu Gl„ oi^ val 



^" 



ATG ATA TGG ATT TAG TAT GAG GGA TTA CAA rnr^ I 

... ^ ™ - - ^ 



657 666 675 



675 684 
»TT OTA GTC CC» »TA ,TA GTT AAC OCA ACA CCA OTA . 
n. V. ... - CTA AAT OCT AC. TTT OAA O. 



''-^ 720 



TGG AAG GCA AAC ATT GGT TGG GAG TAT GTT GCA ^ AGA ATA I'' 
X. Aa. A. ... ^ - - - - .CC CCA ATC 



774 783 



AAA GAG GGA ACA GTG ACA ATT CCA TAC GGA GCA ^ ATA An. 
Ws Glu Gly Thr Val Thr He Pro Tyr Gly Al^ Z T " 

^ Ala Phe I le_Ser-Va-l-A-l-a-Ala-Asn- 



828 837 



ATT TCA AGC TTA CCA AAT TAC ACA GAA CTT TAC TTA GAG 
- ser ser .e. p. a. ... ... ^ ^ ^ ^ ^ - ATT GGA 



^■^^ 882 



ACT CAG TTT OCA ACO CCA AOC ACT ACC TCC CCC CAC CT. 
- o.. T. ^ - - - - - - ATC ACA 

92 7 

ATA ACA CTA ACT CTA OAT ^ OCT CTT A^ TCC TAA ,. 

T^ T., .„ A.P A,, P„ ^ T " 



Figure 18b (continued)

GLYCOSIDASE ENZYMES
BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to newly identified polynucleotides, polypeptides encoded by such
polynucleotides, the use of such polynucleotides and polypeptides, as well as the
production and isolation of such polynucleotides and polypeptides. More particularly,
the polynucleotides and polypeptides of the present invention have been putatively
identified as glucosidases, α-galactosidases, β-galactosidases, β-mannosidases,
β-mannanases, endoglucanases, and pullulanases.

2. Description of Related Art

The glycosidic bond of β-galactosides can be cleaved by different classes of enzymes:
(i) phospho-β-galactosidases (EC 3.2.1.85) are specific for a phosphorylated substrate
generated via phosphoenolpyruvate phosphotransferase system (PTS)-dependent uptake;
(ii) typical β-galactosidases (EC 3.2.1.23), represented by the Escherichia coli LacZ
enzyme, which are relatively specific for β-galactosides; and (iii) β-glucosidases (EC
3.2.1.21) such as the enzymes of Agrobacterium faecalis, Clostridium thermocellum,
Pyrococcus furiosus or Sulfolobus solfataricus (Day, A.G. and Withers, S.G. (1986)
Purification and characterization of a β-glucosidase from Alcaligenes faecalis. Can. J.
Biochem. Cell. Biol. 64, 914-922; Kengen, S.W.M., et al. (1993) Eur. J. Biochem. 213,
305-312; Ait, N., Creuzet, N. and Cattaneo, J. (1982) Properties of β-glucosidase purified
from Clostridium thermocellum. J. Gen. Microbiol. 128, 569-577; Grogan, D.W. (1991)
Evidence that β-galactosidase of Sulfolobus solfataricus is only one of several activities
of a thermostable β-D-glycosidase. Appl. Environ. Microbiol. 57, 1644-1649). Members
of the latter group, although highly specific with respect to the β-anomeric configuration
of the glycosidic linkage, often display a rather relaxed substrate specificity and
hydrolyze β-glucosides as well as β-fucosides and β-galactosides.

Generally, α-galactosidases are enzymes that catalyze the hydrolysis of galactose groups
on a polysaccharide backbone or the cleavage of di- or oligosaccharides comprising
galactose.

Generally, β-mannanases are enzymes that catalyze the hydrolysis of mannose groups
internally on a polysaccharide backbone or the cleavage of di- or oligosaccharides
comprising mannose groups. β-Mannosidases hydrolyze non-reducing, terminal mannose
residues on a mannose-containing polysaccharide and cleave di- or oligosaccharides
comprising mannose groups.

Guar gum is a branched galactomannan polysaccharide composed of a β-1,4-linked
mannose backbone with α-1,6-linked galactose side chains. The enzymes required for
the degradation of guar are β-mannanase, β-mannosidase and α-galactosidase.
β-Mannanase hydrolyses the mannose backbone internally, β-mannosidase hydrolyses
non-reducing, terminal mannose residues, and α-galactosidase hydrolyses α-linked
galactose groups.

Galactomannan polysaccharides and the enzymes that degrade them have a variety of
applications. Guar is commonly used as a thickening agent in food and is utilized in
hydraulic fracturing in oil and gas recovery. Consequently, galactomannanases are
industrially relevant for the degradation and modification of guar. Furthermore, a need
exists for thermostable galactomannanases that are active in the extreme conditions
associated with drilling and well stimulation.

There are other applications for these enzymes in various industries, such as the beet
sugar industry. Sugar beets supply 20-30% of domestic U.S. sucrose consumption.
Raw beet sugar can contain a small amount of raffinose when the sugar beets are
stored before processing and rotting begins to set in. Raffinose inhibits the
crystallization of sucrose and also constitutes a hidden quantity of sucrose. Thus, there
is merit to eliminating raffinose from raw beet sugar. α-Galactosidase has also been used
as a digestive aid to break down raffinose, stachyose, and verbascose in such foods as
beans and other gassy foods.

β-Galactosidases which are active and stable at high temperatures appear to be superior
enzymes for the production of lactose-free dietary milk products (Chaplin, M.F. and
Bucke, C. (1990) In: Enzyme Technology, pp. 159-160, Cambridge University Press,
Cambridge, UK). Also, several studies have demonstrated the applicability of
β-galactosidases to the enzymatic synthesis of oligosaccharides via transglycosylation
reactions (Nilsson, K.G.I. (1988) Enzymatic synthesis of oligosaccharides. Trends
Biotechnol. 6, 256-264; Cote, G.L. and Tao, B.Y. (1990) Oligosaccharide synthesis by
enzymatic transglycosylation. Glycoconjugate J. 7, 145-162). Despite the commercial
potential, only a few β-galactosidases of thermophiles have been characterized so far.
Two genes reported encode β-galactoside-cleaving enzymes of the hyperthermophilic
bacterium Thermotoga maritima, one of the most thermophilic organotrophic eubacteria
described to date (Huber, R., Langworthy, T.A., Konig, H., Thomm, M., Woese, C.R.,
Sleytr, U.B. and Stetter, K.O. (1986) T. maritima sp. nov. represents a new genus of
unique extremely thermophilic eubacteria growing up to 90°C. Arch. Microbiol. 144,
324-333). The gene products have been identified as a β-galactosidase and a
β-glucosidase.

Pullulanase is well known as a debranching enzyme of pullulan and starch. The enzyme
hydrolyzes α-1,6-glucosidic linkages in these polymers. Starch degradation for the
production of sweeteners (glucose or maltose) is a very important industrial application
of this enzyme. The degradation of starch proceeds in two stages. The first stage
involves the liquefaction of the substrate with α-amylase, and the second stage, or
saccharification stage, is performed by β-amylase with pullulanase added as a
debranching enzyme to obtain better yields.

Endoglucanases can be used in a variety of industrial applications. For instance, the
endoglucanases of the present invention can hydrolyze the internal β-1,4-glycosidic
bonds in cellulose, which may be used for the conversion of plant biomass into fuels and
chemicals. Endoglucanases also have applications in detergent formulations, the textile
industry, in animal feed, in waste treatment, and in the fruit juice and brewing industry
for the clarification and extraction of juices.

Brief Description of the Drawings

The following drawings are illustrative of embodiments of the invention and are not 
meant to limit the scope of the invention as encompassed by the claims. 

Figures 1a-b are the full-length DNA and corresponding deduced amino acid sequence
of M11TL of the present invention. Sequencing was performed using a 378 automated
DNA sequencer for all sequences of the present invention (Applied Biosystems, Inc.).

Figure 2 is an illustration of the full-length DNA and corresponding deduced amino acid 
sequence of OC1/4V-33B/G. 

Figure 3 is an illustration of the full-length DNA and corresponding deduced amino acid
sequence of F1-12G.

Figures 4a-b are the full-length DNA and corresponding deduced amino acid sequence
of 9N2-31B/G.

Figures 5a-b are the full-length DNA and corresponding deduced amino acid sequence
of MSB8-6G.

Figure 6 is the full-length DNA and corresponding deduced amino acid sequence of
AEDII12RA-18B/G.

Figures 7a-b are the full-length DNA and corresponding deduced amino acid sequence
of GC74-22G.

Figures 8a-b are the full-length DNA and corresponding deduced amino acid sequence
of VC1-7G1.

Figures 9a-c are the full-length DNA and corresponding deduced amino acid sequence
of 37GP1.

Figures 10a-c are the full-length DNA and corresponding deduced amino acid sequence
of 6GC2.

Figures 11a-d are the full-length DNA and corresponding deduced amino acid sequence
of 6GP2.

Figures 12a-c are the full-length DNA and corresponding deduced amino acid sequence
of 63GB1.

Figures 13a-b are the full-length DNA and corresponding deduced amino acid sequence
of OC1/4V.

Figures 14a-e are the full-length DNA and corresponding deduced amino acid sequence
of 6GP3.

Figures 15a-d are the full-length DNA and corresponding deduced amino acid sequence
of Thermotoga maritima MSB8-6GP2.

Figures 16a-c are the full-length DNA and corresponding deduced amino acid sequence
of Thermotoga maritima MSB8-6GB4.

Figures 17a-d are the full-length DNA and corresponding deduced amino acid sequence
of Bankia gouldi 37GP4.

Figures 18a-b are the full-length DNA and corresponding deduced amino acid sequence
of Pyrococcus furiosus VC1-7EG1.

SUMMARY OF THE INVENTION

In a preferred embodiment of the present invention, there are provided isolated nucleic 
acids (polynucleotides) which encode mature enzymes having the deduced amino acid 
sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). 

In another embodiment, the invention provides a method for producing a polypeptide
which includes culturing host cells containing a polynucleotide of Figures 1-18,
expressing from the host cells a polypeptide encoded by the polynucleotide, and isolating
the polypeptide.

In another embodiment, the invention provides an enzyme selected from the group
consisting of an enzyme having an amino acid sequence set forth in SEQ ID NOS: 15-28
or 61-64 and an enzyme which has at least 30 consecutive amino acid residues of an
enzyme having an amino acid sequence set forth in SEQ ID NOS: 15-28 or 61-64.
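
The "at least 30 consecutive amino acid residues" criterion can be read as a shared-run test between a candidate sequence and a reference sequence. The following is a minimal sketch in Python, assuming one-letter amino acid codes and exact identity over the run; the function names and the `minimum` parameter are illustrative and are not part of the disclosure.

```python
def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest stretch of consecutive residues present in both sequences."""
    best = 0
    prev = [0] * (len(b) + 1)  # run lengths ending at the previous residue of `a`
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the preceding pair of residues.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def shares_consecutive_residues(candidate: str, reference: str, minimum: int = 30) -> bool:
    """True if candidate and reference share a run of at least `minimum` identical residues."""
    return longest_shared_run(candidate, reference) >= minimum
```

A candidate would be compared against each reference sequence in turn; this sketch does not address conservative substitutions or alignment gaps.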

In yet another embodiment, the invention provides a method for generating glucose from
soluble cellooligosaccharides which includes contacting a sample containing
oligosaccharides with an effective amount of an enzyme selected from the group of
enzymes having the amino acid sequence set forth in SEQ ID NOS: 15-28, 61-63 and 64
such that glucose is produced.

The publications discussed herein are provided solely for their disclosure prior to the
filing date of the present application. Nothing herein is to be construed as an admission
that the invention is not entitled to antedate such disclosure by virtue of prior invention.

Definitions 

"Monosaccharide", as used herein, refers to a single polyhydroxy aldehyde or ketone
unit.

"Oligosaccharide", as used herein, refers to short chains of monosaccharide units joined
together by covalent bonds. Of these, the most abundant are the disaccharides, which
have two monosaccharide units.

"Polysaccharide", as used herein, refers to long chains having many monosaccharide
units.

The term "gene" means the segment of DNA involved in producing a polypeptide chain; 
it includes regions preceding and following the coding region (leader and trailer) as well 
as intervening sequences (introns) between individual coding segments (exons). 

A coding sequence is "operably linked to" another coding sequence when RNA
polymerase will transcribe the two coding sequences into a single mRNA, which is then
translated into a single polypeptide having amino acids derived from both coding
sequences. The coding sequences need not be contiguous to one another so long as the
expressed sequences are ultimately processed to produce the desired protein.

"Recombinant" enzymes refer to enzymes produced by recombinant DNA techniques; 
i.e., produced from cells transformed by an exogenous DNA construct encoding the
desired enzyme. "Synthetic" enzymes are those prepared by chemical synthesis. 

A DNA "coding sequence of" or a "nucleotide sequence encoding" a particular enzyme,
is a DNA sequence which is transcribed and translated into an enzyme when placed 
under the control of appropriate regulatory sequences. 

Detailed Description of the Invention

The polynucleotides and polypeptides of the present invention have been identified as 
glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases as a result of their enzymatic activity.

In accordance with one aspect of the present invention, there are provided novel
enzymes, as well as active fragments, analogs and derivatives thereof. 

In accordance with another aspect of the present invention, there are provided isolated
nucleic acid molecules encoding the enzymes of the present invention including mRNAs,
cDNAs, genomic DNAs as well as active analogs and fragments of such enzymes.

In accordance with yet a further aspect of the present invention, there is provided a 
process for producing such polypeptides by recombinant techniques comprising culturing 
recombinant prokaryotic and/or eukaryotic host cells, containing a nucleic acid sequence 
of the present invention, under conditions promoting expression of said enzymes and 
subsequent recovery of said enzymes.

In accordance with yet a further aspect of the present invention, there is provided a 
process for utilizing such enzymes, or polynucleotides encoding such enzymes for 
hydrolyzing lactose to galactose and glucose for use in the food processing industry, the 
pharmaceutical industry, for example, to treat intolerance to lactose, as a diagnostic
reporter molecule, in corn wet milling, in the fruit juice industry, in baking, in the textile
industry and in the detergent industry. 

In accordance with yet a further aspect of the present invention, there is provided a 
process for utilizing such enzymes for hydrolyzing guar gum (a galactomannan
polysaccharide) to remove non-reducing terminal mannose residues. Further,
polysaccharides such as galactomannan and the enzymes according to the invention that
degrade them have a variety of applications. Guar gum is commonly used as a
thickening agent in food and also is utilized in hydraulic fracturing in oil and gas
recovery. Consequently, mannanases are industrially relevant for the degradation and
modification of guar gums. Furthermore, a need exists for thermostable mannanases that
are active in extreme conditions associated with drilling and well stimulation.

In accordance with yet a further aspect of the present invention, there are also provided
nucleic acid probes comprising nucleic acid molecules of sufficient length to specifically 
hybridize to a nucleic acid sequence of the present invention. 

In accordance with yet a further aspect of the present invention, there is provided a 
process for utilizing such enzymes, or polynucleotides encoding such enzymes, for in
vitro purposes related to scientific research, for example, to generate probes for
identifying similar sequences which might encode similar enzymes from other organisms
by using certain regions, i.e., conserved sequence regions, of the nucleotide sequence.

These and other aspects of the present invention should be apparent to those skilled in 
the art from the teachings herein.

The polynucleotides of this invention were originally recovered from genomic gene 
libraries derived from the following organisms: 

M11TL is a new species of Desulfurococcus isolated from Diamond Pool in Yellowstone
National Park. The organism grows optimally at 85-88 °C, pH 7.0 in a low salt medium
containing yeast extract, peptone, and gelatin as substrates with a N2/CO2 gas phase.

OC1/4V is from the genus Thermotoga. The organism was isolated from Yellowstone 
National Park. It grows optimally at 75 °C in a low salt medium with cellulose as a
substrate and N2 in gas phase.

Pyrococcus furiosus VC1 (and 7EG1) is from the genus Pyrococcus. VC1 was isolated
from Vulcano, Italy. It grows optimally at 100 °C in a high salt medium (marine)
containing elemental sulfur, yeast extract, peptone and starch as substrates and N2 in gas
phase.

Staphylothermus marinus F1 is from the genus Staphylothermus. F1 was isolated from
Vulcano, Italy. It grows optimally at 85 °C, pH 6.5 in high salt medium (marine)
containing elemental sulfur and yeast extract as substrates and N2 in gas phase.

Thermococcus 9N-2 is from the genus Thermococcus. 9N-2 was isolated from diffuse
vent fluid in the East Pacific Rise. It is a strict anaerobe that grows optimally at 87 °C.

Thermotoga maritima MSB8 and MSB8 (Clone # 6GP2 and 6GB4) is from the genus 
Thermotoga, and was isolated from Vulcano, Italy. MSB8 grows optimally at 85 °C, pH
6.5 in a high salt medium (marine) containing starch and yeast extract as substrates and
N2 in gas phase. 

Thermococcus alcaliphilus AEDII12RA is from the genus Thermococcus. AEDII12RA
grows optimally at 85 °C, pH 9.5 in a high salt medium (marine) containing polysulfides
and yeast extract as substrates and N2 in gas phase.

Thermococcus chitonophagus GC74 is from the genus Thermococcus. GC74 grows
optimally at 85 °C, pH 6.0 in a high salt medium (marine) containing chitin, meat extract,
elemental sulfur and yeast extract as substrates and N2 in gas phase. AEPII 1a grows
optimally at 85 °C at pH 6.5 in marine medium under anaerobic conditions. It has many
substrates. Bankia gouldi is from the genus Bankia.

Accordingly, the polynucleotides and enzymes encoded thereby are identified by the
organism from which they were isolated, and are sometimes hereinafter referred to as
"M11TL" (Figure 1 and SEQ ID NOS:1 and 15), "OC1/4V-33B/G" (Figure 2 and SEQ
ID NOS:2 and 16), "F1-12G" (Figure 3 and SEQ ID NOS:3 and 17), "9N2-31B/G"
(Figure 4 and SEQ ID NOS:4 and 18), "MSB8" (Figure 5 and SEQ ID NOS:5 and 19),
"AEDII12RA-18B/G" (Figure 6 and SEQ ID NOS:6 and 20), "GC74-22G" (Figure 7 and
SEQ ID NOS:7 and 21), "VC1-7G1" (Figure 8 and SEQ ID NOS:8 and 22), "37GP1"
(Figure 9 and SEQ ID NOS:9 and 23), "6GC2" (Figure 10 and SEQ ID NOS:10 and
24), "6GP2" (Figure 11 and SEQ ID NOS:11 and 25), "AEPII 1a" (Figure 12 and SEQ
ID NOS:12 and 26), "OC1/4V" (Figure 13 and SEQ ID NOS:13 and 27), "6GP3"
(Figure 14 and SEQ ID NOS:28), "MSB8-6GP2" (Figure 15 and SEQ ID NOS:57 and
61), "MSB8-6GB4" (Figure 16 and SEQ ID NOS:58 and 62), "VC1-7EG1" (Figure 17 and
SEQ ID NOS:59 and 63), and "37GP4" (Figure 18 and SEQ ID NOS:60 and 64).

The polynucleotides and polypeptides of the present invention show identity at the 
nucleotide and protein level to known genes and proteins encoded thereby as shown in 
Table 1. 

Table 1

Clone                                         Gene/Protein with Closest Homology                          Protein     Nucleic Acid
                                                                                                          Identity    Identity
--------------------------------------------  ----------------------------------------------------------  ----------  ------------
M11TL-29G                                     Sulfolobus solfataricus DSM 1616/P1, β-galactosidase        51%         55%
OC1/4V-33B/G                                  Caldocellum saccharolyticum, β-glucosidase                  52%         57%
Staphylothermus marinus F1-12G                Bacillus polymyxa, β-galactosidase                          36%         48%
Thermococcus 9N2-31B/G                        Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase     51%         50%
Thermotoga maritima MSB8-6G                   Clostridium thermocellum bglB                               45%         53%
Thermococcus AEDII12RA-18B/G                  Bacillus polymyxa, β-galactosidase                          34%         48%
Thermococcus chitonophagus GC74-22G           Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase     46%         54%
Pyrococcus furiosus VC1-7G1                   Sulfolobus solfataricus/MT-4 galactosidase                  46.4%       52.5%
Thermotoga maritima α-galactosidase (6GC2)    Pediococcus pentosaceus α-galactosidase                     49%         29%
Thermotoga maritima β-mannanase (6GP2)        Aspergillus aculeatus mannanase                             56%         37%
AEPII 1a β-mannosidase (63GB1)                Sulfolobus solfataricus β-galactosidase                     78%         56%
OC1/4V endoglucanase (33GP1)                  Clostridium thermocellum endo-1,4-β-endoglucanase           65%         43%
Thermotoga maritima pullulanase (6GP3)        Caldocellum saccharolyticum α-dextran 6-glucanohydrolase    72%         53%
Bankia gouldi mix endoglucanase (37GP1)       None available


The polynucleotides and enzymes of the present invention show homology to each other 
as shown in Table 2. 



Table 2

Clone                                         Gene/Protein with Closest Homology                          Protein     Nucleic Acid
                                                                                                          Identity    Identity
--------------------------------------------  ----------------------------------------------------------  ----------  ------------
Staphylothermus marinus F1-12G                Thermococcus AEDII12RA-18B/G, β-galactosidase, glucosidase  55%         57%
Thermococcus 9N2-31B/G                        Thermococcus chitonophagus GC74-22G glucosidase             74%         66%
Pyrococcus furiosus VC1-7G1                   Pyrococcus furiosus VC1-7B/G β-galactosidase                46.4%       54%


All the clones identified in Tables 1 and 2 encode polypeptides which have α-glycosidase
or β-glycosidase activity.

This invention, in addition to the isolated nucleic acid molecules encoding the enzymes
of the present invention, also provides substantially similar sequences. Isolated nucleic
acid sequences are substantially similar if: (i) they are capable of hybridizing under
conditions hereinafter described, to the polynucleotides of SEQ ID NOS: 1-14 and 57-60;
or (ii) they encode DNA sequences which are degenerate to the polynucleotides of SEQ
ID NOS: 1-14 and 57-60. Degenerate DNA sequences encode the amino acid sequences
of SEQ ID NOS: 15-28 and 61-64, but have variations in the nucleotide coding
sequences. As used herein, substantially similar refers to the sequences having similar
identity to the sequences of the instant invention. The nucleotide sequences that are
substantially the same can be identified by hybridization or by sequence comparison.
Enzyme sequences that are substantially the same can be identified by one or more of the
following: proteolytic digestion, gel electrophoresis and/or microsequencing.
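
The notion of degenerate sequences described above can be illustrated with a minimal Python sketch. It assumes the standard genetic code; the two short sequences are hypothetical toy examples and are not taken from the SEQ ID listings, and the function names are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a coding-strand DNA sequence, stopping at the first stop codon."""
    protein = []
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def are_degenerate(dna_a, dna_b):
    """True if the sequences differ in their bases but encode the same amino acids."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


# Hypothetical toy sequences: both encode Met-Leu-Ser using different codons.
seq_a = "ATGCTGTCT"
seq_b = "ATGTTAAGC"
print(translate(seq_a), translate(seq_b), are_degenerate(seq_a, seq_b))
```

Comparing translations rather than bases is what distinguishes a degenerate sequence from one that is merely similar at the nucleotide level.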

One means for isolating the nucleic acid molecules encoding the enzymes of the present 
invention is to probe a gene library with a natural or artificially designed probe using art 
recognized procedures (see, for example: Current Protocols in Molecular Biology, 
Ausubel F.M. et al. (Eds.), Greene Publishing Assoc. and John Wiley
Interscience, New York, 1989, 1992). It is appreciated by one skilled in the art that the
polynucleotides of SEQ ID NOS: 1-14 and 57-60 or fragments thereof (comprising at
least 12 contiguous nucleotides), are particularly useful probes. Other particularly useful
probes for this purpose are hybridizable fragments to the sequences of SEQ ID NOS: 1-14
and 57-60 (i.e., comprising at least 12 contiguous nucleotides).

With respect to nucleic acid sequences which hybridize to specific nucleic acid 
sequences disclosed herein, hybridization may be carried out under conditions of reduced 
stringency, medium stringency or even stringent conditions. As an example of 
oligonucleotide hybridization, a polymer membrane containing immobilized denatured
nucleic acids is first prehybridized for 30 minutes [...]
M NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X Denhardt's, and
0.5 mg/ml polyriboadenylic acid. Approximately 2 X 10^ cpm (specific activity 4-9 X
10^ cpm/ug) of 32P end-labeled oligonucleotide probe is then added to the solution.
After 12-16 hours of incubation, the membrane is washed for 30 minutes at room
temperature in 1X SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM
Na2EDTA) containing 0.5% SDS, followed by a 30 minute wash in fresh 1X SET at
Tm - 10 °C for the oligonucleotide probe. The membrane is then exposed to auto-radiographic
film for detection of hybridization signals. 
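
As a rough illustration of how a wash temperature relative to the probe's melting temperature might be estimated, the sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C), a common approximation for short oligonucleotides. The source does not state how Tm is determined, so the formula, the 10 °C offset argument and the example probe are all assumptions made for illustration.

```python
def wallace_tm(oligo):
    """Approximate melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def stringent_wash_temp(oligo, offset=10):
    """Wash temperature taken as `offset` degrees below the estimated Tm (assumed offset)."""
    return wallace_tm(oligo) - offset


probe = "ATGCGTACCGTTAGC"  # hypothetical 15-mer, not from the SEQ ID listings
print(wallace_tm(probe), stringent_wash_temp(probe))
```

The Wallace rule is only reasonable for probes of roughly 14 to 20 bases; longer probes would call for a salt-adjusted or nearest-neighbor estimate.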

Stringent conditions means hybridization will occur only if there is at least 90% identity, 
preferably at least 95% identity and most preferably at least 97% identity between the
sequences. Further, it is understood that a section of a 100 bps sequence that is 95 bps
in length has 95% identity with the 100 bps sequence from which it is obtained. See J.
Sambrook et al., Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor
Laboratory (1989) which is hereby incorporated by reference in its entirety. Also, it is
understood that a fragment of a 100 bps sequence that is 95 bps in length has 95%
identity with the 100 bps sequence from which it is obtained.

As used herein, a first DNA (RNA) sequence is at least 70% and preferably at least 80%
identical to another DNA (RNA) sequence if there is at least 70% and preferably at least
80% or 90% identity, respectively, between the bases of the first sequence and the
bases of the other sequence, when properly aligned with each other, for example when
aligned by BLASTN.

"Identity" as the term is used herein, refers to a polynucleotide sequence which
comprises a percentage of the same bases as a reference polynucleotide (SEQ ID NOS: 1-14
and 57-60). For example, a polynucleotide which is at least 90% identical to a
reference polynucleotide, has polynucleotide bases which are identical in 90% of the
bases which make up the reference polynucleotide and may have different bases in 10%
of the bases which comprise that polynucleotide sequence.
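
The identity definition above can be made concrete with a short Python sketch. It assumes the two sequences have already been aligned to equal length (for example by BLASTN), with "-" marking gaps; the function name and the toy sequences are hypothetical.

```python
def percent_identity(reference, candidate):
    """Percent of reference bases matched by a pre-aligned candidate ('-' marks a gap)."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must be aligned to equal length")
    ref = reference.upper()
    cand = candidate.upper()
    matches = sum(1 for r, c in zip(ref, cand) if r == c and r != "-")
    ref_bases = sum(1 for r in ref if r != "-")
    return 100.0 * matches / ref_bases


# A 95-base fragment of a 100-base reference, padded with gaps to the same length.
reference = "ACGT" * 25
fragment = reference[:95] + "-" * 5
print(percent_identity(reference, fragment))
```

Dividing by the number of reference bases, rather than by the alignment length, mirrors the text's statement that a 95 bps fragment of a 100 bps sequence has 95% identity with that sequence.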

The present invention relates to polynucleotides which differ from the reference
polynucleotide such that the changes are silent changes, for example the changes do not
alter the amino acid sequence encoded by the polynucleotide. The present invention also
relates to nucleotide changes which result in amino acid substitutions, additions,
deletions, fusions and truncations in the polypeptide encoded by the reference
polynucleotide. In a preferred aspect of the invention these polypeptides retain the same 
biological action as the polypeptide encoded by the reference polynucleotide. 

It is also appreciated that such probes can be and are preferably labeled with an 
analytically detectable reagent to facilitate identification of the probe. Useful reagents
include but are not limited to radioactivity, fluorescent dyes or enzymes capable of
catalyzing the formation of a detectable product. The probes are thus useful to isolate
complementary copies of DNA from other sources or to screen such sources for related
sequences. 

The polynucleotides of this invention were recovered from genomic gene libraries from 
the organisms listed in Table 1. For example, gene libraries can be generated in the
Lambda ZAP II cloning vector (Stratagene Cloning Systems). Mass excisions can be
performed on these libraries to generate libraries in the pBluescript phagemid. Libraries
are thus generated and excisions performed according to the protocols/methods 
hereinafter described. 

The excision libraries are introduced into the E. coli strain BW14893 F'kan1A.
Expression clones are then identified using a high temperature filter assay. Expression
clones encoding several glucanases and several other glycosidases are identified and 
repurified. The polynucleotides, and enzymes encoded thereby, of the present invention, 
yield the activities as described above. 

The coding sequences for the enzymes of the present invention were identified by
screening the genomic DNAs prepared for the clones having glucosidase or galactosidase
activity. 

An example of such an assay is a high temperature filter assay wherein expression clones 
were identified by use of high temperature filter assays using buffer Z (see recipe below) 
containing 1 mg/ml of the substrate 5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside
(XGLU) (Diagnostic Chemicals Limited or Sigma) after introducing an excision library
into the E. coli strain BW14893 F'kan1A. Expression clones encoding XGLUases were
identified and repurified from M11TL, OC1/4V, Pyrococcus furiosus VC1,
Staphylothermus marinus F1, Thermococcus 9N-2, Thermotoga maritima MSB8,
Thermococcus alcaliphilus AEDII12RA, and Thermococcus chitonophagus GC74.

Z-buffer: (referenced in Miller, J.H. (1992) A Short Course in Bacterial Genetics, p.
445.)

per liter:

    Na2HPO4·7H2O         16.1 g
    NaH2PO4·7H2O         5.5 g
    KCl                  0.75 g
    MgSO4·7H2O           0.246 g
    β-mercaptoethanol    2.7 ml

Adjust pH to 7.0
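
For batches other than one liter, the recipe above scales linearly. The following Python sketch is a convenience illustration only; the dictionary keys and the function name are hypothetical, and the amounts are copied from the per-liter recipe as given.

```python
# Per-liter amounts as listed in the Z-buffer recipe above.
Z_BUFFER_PER_LITER = {
    "Na2HPO4-7H2O (g)": 16.1,
    "NaH2PO4-7H2O (g)": 5.5,
    "KCl (g)": 0.75,
    "MgSO4-7H2O (g)": 0.246,
    "beta-mercaptoethanol (ml)": 2.7,
}


def scale_recipe(volume_ml, recipe=Z_BUFFER_PER_LITER):
    """Return component amounts for `volume_ml` of buffer, rounded to 4 decimal places."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    factor = volume_ml / 1000.0
    return {name: round(amount * factor, 4) for name, amount in recipe.items()}


# Example: amounts for a 250 ml batch; the pH is still adjusted to 7.0 afterwards.
for name, amount in scale_recipe(250).items():
    print(f"{name}: {amount}")
```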

High Temperature Filter Assay

(1) The F' factor F'kan (from E. coli strain CSH118)(1) was introduced into the
    pho- pnh- lac- strain BW14893(2). The filamentous phage
    library was plated on the resulting strain, BW14893 F'kan. (Miller, J.H.
    (1992) A Short Course in Bacterial Genetics; Lee, K.S., Metcalf, et al.,
    (1992) Evidence for two phosphonate degradative pathways in Enterobacter
    aerogenes, J. Bacteriol., 174:2501-2510.)

(2) After growth on 100 mm LB plates containing 100 µg/ml ampicillin, 80
    µg/ml methicillin and 1 mM IPTG, colony lifts were performed using
    Millipore HATF membrane filters.

(3) The colonies transferred to the filters were lysed with chloroform vapor in
    150 mm glass petri dishes.

(4) The filters were transferred to 100 mm glass petri dishes containing a piece
    of Whatman 3MM filter paper saturated with buffer.

    (a) When testing for galactosidase activity (XGALase), the 3MM paper
        was saturated with Z buffer containing 1 mg/ml XGAL (ChemBridge
        Corporation). After transferring the filter bearing lysed colonies to the
        glass petri dish, the dish was placed in an oven at 80-85 °C.

    (b) When testing for glucosidase (XGLUase), the 3MM paper was
        saturated with Z buffer containing 1 mg/ml XGLU. After transferring
        the filter bearing lysed colonies to the glass petri dish, the dish was
        placed in an oven at 80-85 °C.

(5) "Positives" were observed as blue spots on the filter membranes. The
    following filter rescue technique was used to retrieve plasmid from a lysed
    positive colony. A Pasteur pipette (or glass capillary tube) was used to core
    blue spots on the filter membrane. The small filter disk was placed in an
    Eppendorf tube containing 20 µl water. The Eppendorf tube was incubated at
    75 °C for 5 minutes followed by vortexing to elute plasmid DNA off the
    filter. This DNA was transformed into electrocompetent E. coli DH10B cells
    for Thermotoga maritima MSB8-6G, Staphylothermus marinus F1-12G,
    Thermococcus AEDII12RA-18B/G, Thermococcus chitonophagus GC74-22G,
    M11TL and OC1/4V. Electrocompetent BW14893 F'kan1A E. coli were used for
    Thermococcus 9N2-31B/G and Pyrococcus furiosus VC1-7G1. The filter-lift
    assay was repeated on transformation plates to identify "positives". The
    transformation plates were returned to a 37 °C incubator after the filter lift
    to regenerate colonies. 3 ml of LB liquid containing 100 µg/ml ampicillin was
    inoculated with repurified positives and incubated at 37 °C overnight.
    Plasmid DNA was isolated from these cultures and the plasmid insert was
    sequenced. In some instances where the plates used for the initial colony
    lifts contained non-confluent colonies, a specific colony corresponding to a
    blue spot on the filter could be identified on a regenerated plate and
    repurified directly, instead of using the filter rescue technique.

Another example of such an assay is a variation of the high temperature filter assay
wherein colony-laden filters are heat-killed at different temperatures (for example, 105 °C
for 20 minutes) to monitor thermostability. The 3MM paper is saturated with different
buffers (i.e., 100 mM NaCl, 5 mM MgCl2, 100 mM Tris-Cl (pH 9.5)) to determine
enzyme activity under different buffer conditions.

A β-glucosidase assay may also be employed, wherein GlcpβNp is used as an artificial
substrate (aryl-β-glucosidase). The increase in absorbance at 405 nm as a result of
p-nitrophenol (pNp) liberation was followed on a Hitachi U-1100 spectrophotometer,
equipped with a thermostatted cuvette holder. The assays may be performed at 80 °C or
90 °C in closed 1-ml quartz cuvettes. A standard reaction mixture contains 150 mM
trisodium substrate, pH 5.0 (at 80 °C), and 0.95 mM pNp derivative (ε pNp = 0.561 mM^-1
cm^-1). The reaction mixture is allowed to reach the desired temperature, after which the
reaction is started by injecting an appropriate amount of enzyme (1.06 ml final volume).

1 U β-glucosidase activity is defined as that amount required to catalyze the formation
of 1.0 µmol pNp/min. D-cellobiose may also be used as a substrate.
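
The unit definition above follows from the Beer-Lambert law. The Python sketch below converts a measured rate of absorbance change at 405 nm into enzyme units, using the extinction coefficient (0.561 mM^-1 cm^-1) and the final volume (1.06 ml) quoted in the text. The 1 cm path length is an assumption, as the source does not state it, and the function name is hypothetical.

```python
EPSILON_PNP = 0.561  # mM^-1 cm^-1, extinction coefficient of pNp as given in the text


def glucosidase_units(delta_a405_per_min, volume_ml=1.06, path_cm=1.0):
    """Convert an absorbance slope (A405 per minute) into units (umol pNp per minute).

    delta_A / (epsilon * path) gives mM/min, which equals umol per ml per minute;
    multiplying by the reaction volume in ml gives umol/min, i.e. units (U).
    """
    rate_mm_per_min = delta_a405_per_min / (EPSILON_PNP * path_cm)
    return rate_mm_per_min * volume_ml


# Example: a slope of 0.561 A405/min corresponds to 1 mM/min in the cuvette.
print(round(glucosidase_units(0.561), 3))
```

Dividing the result by the amount of protein added would give a specific activity in U/mg, if a protein determination is available.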

An ONPG assay for β-galactosidase activity is described by Miller, J.H. (1992) A Short
Course in Bacterial Genetics and Miller, J.H. (1992) Experiments in Molecular Genetics,
the contents of which are hereby incorporated by reference in their entirety. 

A quantitative fluorometric assay for β-galactosidase specific activity is described by:
Youngman P., (1987) Plasmid Vectors for Recovering and Exploiting Tn917
Transpositions in Bacillus and other Gram-Positive Bacteria. In Plasmids: A Practical
Approach (ed. K. Hardy) pp 79-103. IRL Press, Oxford. A description of the procedure
can be found in Miller (1992) p. 75-77, the contents of which are incorporated by 
reference herein in their entirety. 

The polynucleotides of the present invention may be in the form of DNA, which DNA
includes cDNA, genomic DNA, and synthetic DNA. The DNA may be double-stranded
or single-stranded, and if single-stranded may be the coding strand or non-coding
(anti-sense) strand. The coding sequence which encodes the mature enzymes may be
identical to the coding sequences shown in Figures 1-18 (SEQ ID NOS: 1-14 and 57-60)
or may be a different coding sequence which coding sequence, as a result of the
redundancy or degeneracy of the genetic code, encodes the same mature enzymes as the
DNA of Figures 1-18 (SEQ ID NOS: 1-14 and 57-60).

The polynucleotide which encodes for the mature enzyme of Figures 1-18 (SEQ ID NOS:
15-28 and 61-64) may include, but is not limited to: only the coding sequence for the
mature enzyme; the coding sequence for the mature enzyme and additional coding
sequence such as a leader sequence or a proprotein sequence; the coding sequence for the
mature enzyme (and optionally additional coding sequence) and non-coding sequence,
such as introns or non-coding sequence 5' and/or 3' of the coding sequence for the mature
enzyme.

Thus, the term "polynucleotide encoding an enzyme (protein)" encompasses a 
polynucleotide which includes only coding sequence for the enzyme as well as a 
polynucleotide which includes additional coding and/or non-coding sequence. 

The present invention further relates to variants of the hereinabove described
polynucleotides which encode for fragments, analogs and derivatives of the enzymes
having the deduced amino acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and
61-64). The variant of the polynucleotide may be a naturally occurring allelic variant of the
polynucleotide or a non-naturally occurring variant of the polynucleotide. 

Thus, the present invention includes polynucleotides encoding the same mature enzymes 
as shown in Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as variants of such
polynucleotides which variants encode for a fragment, derivative or analog of the 
enzymes of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). Such nucleotide variants 
include deletion variants, substitution variants and addition or insertion variants. 

As hereinabove indicated, the polynucleotides may have a coding sequence which is a 
naturally occurring allelic variant of the coding sequences shown in Figures 1-18 (SEQ
ID NOS: 1-14 and 57-60). As known in the art, an allelic variant is an alternate form of
a polynucleotide sequence which may have a substitution, deletion or addition of one or 
more nucleotides, which does not substantially alter the function of the encoded enzyme. 

Fragments of the full length gene of the present invention may be used as a hybridization
probe for a cDNA or a genomic library to isolate the full length DNA and to isolate other
DNAs which have a high sequence similarity to the gene or similar biological activity.
Probes of this type preferably have at least 10, preferably at least 15, and even more
preferably at least 30 bases and may contain, for example, at least 50 or more bases. The
probe may also be used to identify a DNA clone corresponding to a full length transcript
and a genomic clone or clones that contain the complete gene including regulatory and
promoter regions, exons, and introns. An example of a screen comprises isolating the
coding region of the gene by using the known DNA sequence to synthesize an
oligonucleotide probe. Labeled oligonucleotides having a sequence complementary to
that of the gene of the present invention are used to screen a library of genomic DNA to
determine which members of the library the probe hybridizes to.
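
The screening logic described above (synthesize a probe complementary to a known region, then find the library members it would hybridize to) can be sketched in Python as follows. This is a simplified in silico analogy that looks only for perfect complementarity; the sequences, clone names and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_probe(coding_seq, start, length=30):
    """Return a probe complementary to `length` bases of `coding_seq` starting at `start`."""
    region = coding_seq[start:start + length]
    if len(region) < length:
        raise ValueError("region runs past the end of the sequence")
    return reverse_complement(region)


def screen_library(library, probe):
    """Return names of library inserts containing the exact target of the probe."""
    target = reverse_complement(probe).upper()
    return [name for name, insert in library.items() if target in insert.upper()]


# Hypothetical gene and two-member library; only clone_A carries the gene.
gene = "ATGGCTAGCTTAGGCCATCGATCGGATCCGTTAACGGTACC"
library = {
    "clone_A": "TTTT" + gene + "AAAA",
    "clone_B": "GGGGCCCCTTTTAAAA" * 3,
}
probe = design_probe(gene, start=3, length=15)
print(screen_library(library, probe))
```

A real hybridization screen tolerates mismatches depending on stringency, so an exact substring match should be read as the most stringent limiting case.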

The present invention further relates to polynucleotides which hybridize to the 
hereinabove-described sequences if there is at least 70%, preferably at least 90%, and 
more preferably at least 95% identity between the sequences. The present invention 
particularly relates to polynucleotides which hybridize under stringent conditions to the
hereinabove-described polynucleotides. As herein used, the term "stringent conditions"
means hybridization will occur only if there is at least 95% and preferably at least 97%
identity between the sequences. The polynucleotides which hybridize to the hereinabove
described polynucleotides in a preferred embodiment encode enzymes which retain
substantially the same biological function or activity as the mature enzyme encoded by
the DNA of Figures 1-18 (SEQ ID NOS: 1-14 and 57-60).

Alternatively, the polynucleotide may have at least 15 bases, preferably at least 30 bases,
and more preferably at least 50 bases which hybridize to any part of a polynucleotide of
the present invention and which has an identity thereto, as hereinabove described, and
which may or may not retain activity. For example, such polynucleotides may be
employed as probes for the polynucleotides of SEQ ID NOS: 1-14 and 57-60, for
example, for recovery of the polynucleotide or as a diagnostic probe or as a PCR primer.

Thus, the present invention is directed to polynucleotides having at least a 70% identity,
preferably at least 90% identity and more preferably at least a 95% identity to a
polynucleotide which encodes the enzymes of SEQ ID NOS: 15-28 and 61-64 as well as
fragments thereof, which fragments have at least 15 bases, preferably at least 30 bases
and most preferably at least 50 bases, which fragments are at least 90% identical,
preferably at least 95% identical and most preferably at least 97% identical under
stringent conditions to any portion of a polynucleotide of the present invention. 

The present invention further relates to enzymes which have the deduced amino acid 
sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as fragments, analogs 
and derivatives of such enzymes.

The terms "fragment," "derivative" and "analog" when referring to the enzymes of
Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) mean enzymes which retain essentially
the same biological function or activity as such enzymes. Thus, an analog includes a 
proprotein which can be activated by cleavage of the proprotein portion to produce an 
active mature enzyme. 

The enzymes of the present invention may be a recombinant enzyme, a natural enzyme
or a synthetic enzyme, preferably a recombinant enzyme. 

The fragment, derivative or analog of the enzymes of Figures 1-18 (SEQ ID NOS: 15-28
and 61-64) may be (i) one in which one or more of the amino acid residues are
substituted with a conserved or non-conserved amino acid residue (preferably a
conserved amino acid residue) and such substituted amino acid residue may or may not
be one encoded by the genetic code, or (ii) one in which one or more of the amino acid
residues includes a substituent group, or (iii) one in which the mature enzyme is fused
with another compound, such as a compound to increase the half-life of the enzyme (for
example, polyethylene glycol), or (iv) one in which the additional amino acids are fused
to the mature enzyme, such as a leader or secretory sequence or a sequence which is
employed for purification of the mature enzyme or a proprotein sequence. Such
fragments, derivatives and analogs are deemed to be within the scope of those skilled in 
the art from the teachings herein. 

The enzymes and polynucleotides of the present invention are preferably provided in an 
isolated form, and preferably are purified to homogeneity.

The term "isolated" means that the material is removed from its original environment 
(e.g., the natural environment if it is naturally occurring). For example, a naturally- 
occurring polynucleotide or enzyme present in a living animal is not isolated, but the 
same polynucleotide or enzyme, separated from some or all of the coexisting materials 
in the natural system, is isolated. Such polynucleotides could be part of a vector and/or
such polynucleotides or enzymes could be part of a composition, and still be isolated in 
that such vector or composition is not part of its natural environment. 

The enzymes of the present invention include the enzymes of SEQ ID NOS: 15-28 and
61-64 (in particular the mature enzyme) as well as enzymes which have at least 70%
similarity (preferably at least 70% identity) to the enzymes of SEQ ID NOS: 15-28 and
61-64 and more preferably at least 90% similarity (more preferably at least 90% identity)
to the enzymes of SEQ ID NOS: 15-28 and 61-64 and still more preferably at least 95%
similarity (still more preferably at least 95% identity) to the enzymes of SEQ ID NOS:
15-28 and 61-64 and also include portions of such enzymes with such portion of the
enzyme generally containing at least 30 amino acids and more preferably at least 50
amino acids.

As known in the art "similarity" between two enzymes is determined by comparing the
amino acid sequence and its conserved amino acid substitutes of one enzyme to the 
sequence of a second enzyme. 

A variant, i.e. a "fragment", "analog" or "derivative" polypeptide, and reference 
polypeptide may differ in amino acid sequence by one or more substitutions, additions,
deletions, fusions and truncations, which may be present in any combination. 

Among preferred variants are those that vary from a reference by conservative amino 
acid substitutions. Such substitutions are those that substitute a given amino acid in a 
polypeptide by another amino acid of like characteristics. Typically seen as conservative 
substitutions are the replacements, one for another, among the aliphatic amino acids Ala,
Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the
acidic residues Asp and Glu; substitution between the amide residues Asn and Gln;
exchange of the basic residues Lys and Arg and replacements among the aromatic 
residues Phe, Tyr. 
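
The identity and similarity percentages referred to above, together with the conservative
substitution groups just listed, can be illustrated with a short calculation. The following
is a minimal sketch only: it assumes the two sequences are already aligned, of equal
length and free of gaps, and it treats the six groups named in the text as the complete
set of conservative substitutions. The function names and the example sequences are
hypothetical.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("AVLI"),  # aliphatic: Ala, Val, Leu, Ile
    set("ST"),    # hydroxyl: Ser, Thr
    set("DE"),    # acidic: Asp, Glu
    set("NQ"),    # amide: Asn, Gln
    set("KR"),    # basic: Lys, Arg
    set("FY"),    # aromatic: Phe, Tyr
]


def is_conservative(a: str, b: str) -> bool:
    """True if residues a and b fall in the same listed group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def identity_and_similarity(seq1: str, seq2: str) -> tuple[float, float]:
    """Percent identity and percent similarity for two pre-aligned,
    equal-length, gap-free sequences (an assumption of this sketch)."""
    if not seq1 or len(seq1) != len(seq2):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(zip(seq1.upper(), seq2.upper()))
    identical = sum(a == b for a, b in pairs)
    similar = sum(a == b or is_conservative(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)


# Hypothetical example sequences, not taken from the sequence listing.
identity, similarity = identity_and_similarity("AVSDKFGW", "ALTDRYGC")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

With the two hypothetical eight-residue sequences shown, three positions are identical
and four further positions differ only by listed conservative substitutions, giving 37.5%
identity and 87.5% similarity.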



Most highly preferred are variants which retain the same biological function and activity
as the reference polypeptide from which they vary.

Fragments or portions of the enzymes of the present invention may be employed for 
producing the corresponding full-length enzyme by peptide synthesis; therefore, the 
fragments may be employed as intermediates for producing the full-length enzymes. 
Fragments or portions of the polynucleotides of the present invention may be used to
synthesize full-length polynucleotides of the present invention. 

The present invention also relates to vectors which include polynucleotides of the present 
invention, host cells which are genetically engineered with vectors of the invention and 
the production of enzymes of the invention by recombinant techniques.

Host cells are genetically engineered (transduced or transformed or transfected) with the
vectors of this invention which may be, for example, a cloning vector or an expression 
vector. The vector may be, for example, in the form of a plasmid, a viral particle, a 
phage, etc. The engineered host cells can be cultured in conventional nutrient media 
modified as appropriate for activating promoters, selecting transformants or amplifying
the genes of the present invention. The culture conditions, such as temperature, pH and 
the like, are those previously used with the host cell selected for expression, and will be 
apparent to the ordinarily skilled artisan. 

The polynucleotides of the present invention may be employed for producing enzymes 
by recombinant techniques. Thus, for example, the polynucleotide may be included in
any one of a variety of expression vectors for expressing an enzyme. Such vectors
include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives
of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived
from combinations of plasmids and phage DNA; and viral DNA such as vaccinia, adenovirus,
fowl pox virus, and pseudorabies. However, any other vector may be used as long as it
is replicable and viable in the host.

The appropriate DNA sequence may be inserted into the vector by a variety of 
procedures. In general, the DNA sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such procedures and others are 
deemed to be within the scope of those skilled in the art.

The DNA sequence in the expression vector is operatively linked to an appropriate 
expression control sequence(s) (promoter) to direct mRNA synthesis. As representative 
examples of such promoters, there may be mentioned: the LTR or SV40 promoter, the E.
coli lac or trp, the phage lambda promoter and other promoters known to control
expression of genes in prokaryotic or eukaryotic cells or their viruses. The expression
vector also contains a ribosome binding site for translation initiation and a transcription
terminator. The vector may also include appropriate sequences for amplifying
expression.

In addition, the expression vectors preferably contain one or more selectable marker
genes to provide a phenotypic trait for selection of transformed host cells such as
dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as
tetracycline or ampicillin resistance in E. coli.

The vector containing the appropriate DNA sequence as hereinabove described, as well 
as an appropriate promoter or control sequence, may be employed to transform an 
appropriate host to permit the host to express the protein.

As representative examples of appropriate hosts, there may be mentioned: bacterial cells,
such as E. coli, Streptomyces, Bacillus subtilis; fungal cells, such as yeast; insect cells
such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or Bowes
melanoma; adenoviruses; plant cells, etc. The selection of an appropriate host is deemed
to be within the scope of those skilled in the art from the teachings herein.

More particularly, the present invention also includes recombinant constructs comprising
one or more of the sequences as broadly described above. The constructs comprise a
vector, such as a plasmid or viral vector, into which a sequence of the invention has been
inserted, in a forward or reverse orientation. In a preferred aspect of this embodiment,
the construct further comprises regulatory sequences, including, for example, a promoter,
operably linked to the sequence. Large numbers of suitable vectors and promoters are
known to those of skill in the art, and are commercially available. The following vectors
are provided by way of example. Bacterial: pQE70, pQE60, pQE-9 (Qiagen), pD10,
psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene); ptrc99a,
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pSV2CAT, pOG44,
pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia). However, any other
plasmid or vector may be used as long as it is replicable and viable in the host.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol 
transferase) vectors or other vectors with selectable markers. Two appropriate vectors 
are pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3,
T7, gpt, lambda PR, PL and trp. Eukaryotic promoters include CMV immediate early,
HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse
metallothionein-I. Selection of the appropriate vector and promoter is well within the
level of ordinary skill in the art. 

In a further embodiment, the present invention relates to host cells containing the
above-described constructs. The host cell can be a higher eukaryotic cell, such as a mammalian
cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a prokaryotic
cell, such as a bacterial cell. Introduction of the construct into the host cell can be
effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or
electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology,
(1986)).

The constructs in host cells can be used in a conventional manner to produce the gene 
product encoded by the recombinant sequence. Alternatively, the enzymes of the 
invention can be synthetically produced by conventional peptide synthesizers. 

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells
under the control of appropriate promoters. Cell-free translation systems can also be
employed to produce such proteins using RNAs derived from the DNA constructs of the
present invention. Appropriate cloning and expression vectors for use with prokaryotic
and eukaryotic hosts are described by Sambrook, et al., Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is
hereby incorporated by reference.

Transcription of the DNA encoding the enzymes of the present invention by higher
eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are
cis-acting elements of DNA, usually about from 10 to 300 bp, that act on a promoter to
increase its transcription. Examples include the SV40 enhancer on the late side of the
replication origin bp 100 to 270, a cytomegalovirus early promoter enhancer, the
polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.

Generally, recombinant expression vectors will include origins of replication and
selectable markers permitting transformation of the host cell, e.g., the ampicillin
resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a
highly-expressed gene to direct transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic enzymes such as
3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins,
among others. The heterologous structural sequence is assembled in appropriate phase
with translation initiation and termination sequences, and preferably, a leader sequence
capable of directing secretion of translated enzyme. Optionally, the heterologous
sequence can encode a fusion enzyme including an N-terminal identification peptide
imparting desired characteristics, e.g., stabilization or simplified purification of
expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and
termination signals in operable reading phase with a functional promoter. The vector will
comprise one or more phenotypic selectable markers and an origin of replication to
ensure maintenance of the vector and to, if desirable, provide amplification within the
host. Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis,
Salmonella typhimurium and various species within the genera Pseudomonas,
Streptomyces, and Staphylococcus, although others may also be employed as a matter
of choice.

As a representative but nonlimiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from
commercially available plasmids comprising genetic elements of the well known cloning
vector pBR322 (ATCC 37017). Such commercial vectors include, for example,
pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotec,
Madison, WI, USA). These pBR322 "backbone" sections are combined with an
appropriate promoter and the structural sequence to be expressed.

Following transformation of a suitable host strain and growth of the host strain to an
appropriate cell density, the selected promoter is induced by appropriate means (e.g.,
temperature shift or chemical induction) and cells are cultured for an additional period.

Cells are typically harvested by centrifugation, disrupted by physical or chemical means,
and the resulting crude extract retained for further purification.

Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell
lysing agents; such methods are well known to those skilled in the art.

Various mammalian cell culture systems can also be employed to express recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of monkey
kidney fibroblasts, described by Gluzman, Cell, 23:175 (1981), and other cell lines
capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines. Mammalian expression vectors will comprise an origin of replication,
a suitable promoter and enhancer, and also any necessary ribosome binding sites,
polyadenylation site, splice donor and acceptor sites, transcriptional termination
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the
SV40 splice and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

The enzyme can be recovered and purified from recombinant cell cultures by methods
including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation 
exchange chromatography, phosphocellulose chromatography, hydrophobic interaction 
chromatography, affinity chromatography, hydroxylapatite chromatography and lectin 
chromatography. Protein refolding steps can be used, as necessary, in completing
configuration of the mature protein. Finally, high performance liquid chromatography 
(HPLC) can be employed for final purification steps. 

The enzymes of the present invention may be a naturally purified product, or a product 
of chemical synthetic procedures, or produced by recombinant techniques from a 
prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and
mammalian cells in culture). Depending upon the host employed in a recombinant 
production procedure, the enzymes of the present invention may be glycosylated or may 
be non-glycosylated. Enzymes of the invention may or may not also include an initial 
methionine amino acid residue. 

β-Galactosidase hydrolyzes lactose to galactose and glucose. Accordingly, the OC1/4V,
9N2-31B/G, AEDII12RA-18B/G and F1-12G enzymes may be employed in the food
processing industry for the production of low lactose content milk and for the production
of galactose or glucose from lactose contained in whey obtained in a large amount as a
by-product in the production of cheese. Generally, it is desired that enzymes used in
food processing, such as the aforementioned β-galactosidases, be stable at elevated
temperatures to help prevent microbial contamination.

These enzymes may also be employed in the pharmaceutical industry. The enzymes are
used to treat intolerance to lactose. In this case, a thermostable enzyme is desired, as
well. Thermostable β-galactosidases also have uses in diagnostic applications, where
they are employed as reporter molecules.

Glucosidases act on soluble cellooligosaccharides from the non-reducing end to give
glucose as the sole product. Glucanases (endo- and exo-) act in the depolymerization of
cellulose, generating more non-reducing ends (endo-glucanases, for instance, act on
internal linkages yielding cellobiose, glucose and cellooligosaccharides as products).
β-Glucosidases are used in applications where glucose is the desired product. Accordingly,
M11TL, F1-12G, GC74-22G, MSB8-6G, OC1/4V, VC1-7G1, 9N2-31B/G and
AEDII12RA-18B/G may be employed in a wide variety of industrial applications,
including in corn wet milling for the separation of starch and gluten, in the fruit industry
for clarification and equipment maintenance, in baking for viscosity reduction, in the
textile industry for the processing of blue jeans, and in the detergent industry as an
additive. For these and other applications, thermostable enzymes are desirable.

Antibodies generated against the enzymes corresponding to a sequence of the present
invention can be obtained by direct injection of the enzymes into an animal or by
administering the enzymes to an animal, preferably a nonhuman. The antibody so
obtained will then bind the enzymes themselves. In this manner, even a sequence encoding
only a fragment of the enzymes can be used to generate antibodies binding the whole
native enzymes. Such antibodies can then be used to isolate the enzyme from cells
expressing that enzyme.

For preparation of monoclonal antibodies, any technique which provides antibodies
produced by continuous cell line cultures can be used. Examples include the hybridoma
technique (Kohler and Milstein, 1975, Nature, 256:495-497), the trioma technique, the
human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and
the EBV-hybridoma technique to produce human monoclonal antibodies (Cole, et al.,
1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

Techniques described for the production of single chain antibodies (U.S. Patent
4,946,778) can be adapted to produce single chain antibodies to immunogenic enzyme 
products of this invention. Also, transgenic mice may be used to express humanized 
antibodies to immunogenic enzyme products of this invention. 

Antibodies generated against the enzyme of the present invention may be used in
screening for similar enzymes from other organisms and samples. Such screening
techniques are known in the art; for example, one such screening assay is described in
"Methods for Measuring Cellulase Activities", Methods in Enzymology, Vol. 160, pp. 87-116,
which is hereby incorporated by reference in its entirety.

The present invention will be further described with reference to the following examples;
however, it is to be understood that the present invention is not limited to such examples. 
All parts or amounts, unless otherwise specified, are by weight. 

In order to facilitate understanding of the following examples certain frequently 
occurring methods and/or terms will be described. 

"Plasmids" are designated by a lower case p preceded and/or followed by capital letters
and/or numbers. The starting plasmids herein are either commercially available, publicly
available on an unrestricted basis, or can be constructed from available plasmids in
accord with published procedures. In addition, equivalent plasmids to those described
are known in the art and will be apparent to the ordinarily skilled artisan.

"Digestion" of DNA refers to catalytic cleavage of the DNA with a restriction enzyme
that acts only at certain sequences in the DNA. The various restriction enzymes used
herein are commercially available and their reaction conditions, cofactors and other
requirements were used as would be known to the ordinarily skilled artisan. For
analytical purposes, typically 1 µg of plasmid or DNA fragment is used with about 2
units of enzyme in about 20 µl of buffer solution. For the purpose of isolating DNA
fragments for plasmid construction, typically 5 to 50 µg of DNA are digested with 20 to
250 units of enzyme in a larger volume. Appropriate buffers and substrate amounts for
particular restriction enzymes are specified by the manufacturer. Incubation times of
about 1 hour at 37°C are ordinarily used, but may vary in accordance with the supplier's
instructions. After digestion the reaction is electrophoresed directly on a polyacrylamide
gel to isolate the desired fragment.

Size separation of the cleaved fragments is performed using 8 percent polyacrylamide 
gel described by Goeddel, D. et al., Nucleic Acids Res., 8:4057 (1980).

"Oligonucleotides" refers to either a single stranded polydeoxynucleotide or two 
complementary polydeoxynucleotide strands which may be chemically synthesized.
Such synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another 
oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A 
synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated. 

"Ligation" refers to the process of forming phosphodiester bonds between two double 
stranded nucleic acid fragments (Maniatis, T., et al., Id., p. 146). Unless otherwise
provided, ligation may be accomplished using known buffers and conditions with 10
units of T4 DNA ligase ("ligase") per 0.5 µg of approximately equimolar amounts of the
DNA fragments to be ligated.
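
The ligation conditions above (10 units of ligase per 0.5 µg of approximately equimolar
fragments) can be turned into a small worked calculation. The sketch below is
illustrative only: it assumes double-stranded fragments whose mass is proportional to
length, so that equimolar amounts have masses in the ratio of their lengths. The
function names, fragment lengths and vector mass are hypothetical values chosen for the
example.

```python
def equimolar_insert_mass_ug(vector_mass_ug: float,
                             vector_len_bp: int,
                             insert_len_bp: int) -> float:
    """Mass of insert that is equimolar to the given mass of vector.

    For double-stranded DNA, moles are proportional to mass / length,
    so equal moles means mass_insert / mass_vector = len_insert / len_vector.
    """
    return vector_mass_ug * insert_len_bp / vector_len_bp


def ligase_units(total_dna_ug: float, units_per_half_ug: float = 10.0) -> float:
    """Units of T4 DNA ligase at the stated 10 units per 0.5 ug of DNA."""
    return units_per_half_ug * total_dna_ug / 0.5


# Hypothetical example: 0.4 ug of a 3400 bp vector and an 850 bp insert.
vector_ug = 0.4
insert_ug = equimolar_insert_mass_ug(vector_ug, vector_len_bp=3400, insert_len_bp=850)
units = ligase_units(vector_ug + insert_ug)
print(f"insert: {insert_ug:.2f} ug, ligase: {units:.0f} units")
```

In this hypothetical case the insert is one quarter the length of the vector, so 0.1 µg of
insert is equimolar to 0.4 µg of vector, and the combined 0.5 µg of DNA calls for 10
units of ligase.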

Unless otherwise stated, transformation was performed as described in the method of 
Graham, F. and Van der Eb, A., Virology, 52:456-457 (1973).

Example 1 

Bacterial Expression and Purification of Glycosidase Enzymes
DNA encoding the enzymes of the present invention, SEQ ID NOS: 1-14 and 57-60, was
initially amplified from a pBluescript vector containing the DNA by the PCR technique
using the primers noted herein. The amplified sequences were then inserted into the
respective pQE vector listed beneath the primer sequences, and the enzyme was
expressed according to the protocols set forth herein. The 5' and 3' primer sequences for
the respective genes are as follows:

Thermococcus AEDII12RA-18B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC 3' (SEQ ID NO:29)
3' CGGAAGATCTTCATAGCTCCGGAAGCCCATA 5' (SEQ ID NO:30)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
Bgl II.

OC1/4V-33B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGAAGGTCCGATTTTCC 3'
(SEQ ID NO:31)

3' CGGAAGATCTTTAAGATTTTAGAAATTCCTT 5' (SEQ ID NO:32)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
Bgl II.

Thermococcus 9N2-31B/G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGGCTTTCTC 3'
(SEQ ID NO:33)

3' CGGAGGTACCTCACCCAAGTCCGAACTTCTC 5' (SEQ ID NO:34)

Vector: pQE30; and contains the following restriction enzyme sites 5' EcoRI and 3'
KpnI.

Staphylothermus marinus F1-12G

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGGTTTCCTGATTAT 3'
(SEQ ID NO:35)

3' CGGAAGATCTTTATTCGAGGTTCTTTAATCC 5' (SEQ ID NO:36)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
Bgl II.

Thermococcus chitonophagus GC74-22G

5' CCGAGAATTCATTCATTAAAGAGGAGAAATTAACTATGCTFCCAGGAGAACTTTC^ 3'
(SEQ ID NO:37)




3' CGGAGGATCCCTACCCCTCCTCTAAGATCTC 5' (SEQ ID NO:38) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
BamHI. 

M11TL

5' AATAATCTAGAGCATGCAATTCCCCAAAGACTTCATGATAG 3' (SEQ ID NO:39)
3' AATAAAAGCTTACTGGATCAGTGTAAGATGCT 5' (SEQ ID NO:40)

Vector: pQE70; and contains the following restriction enzyme sites 5' SphI and 3'
Hind III.

Thermotoga maritima MSB8-6G 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGGAAAGGATCGATGAAATT 3' (SEQ ID NO:41)
3' CGGAGGTACCTCATGGTTTGAATCTCTTCTC 5' (SEQ ID NO:42)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
KpnI. 

Pyrococcus furiosus VC1-7G1

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGTTCCCTGAAAAGTTCCTT 3' (SEQ ID NO:43)
3' CGGAGGTACCTCATCCCCTCAGCAATTCCTC 5' (SEQ ID NO:44)

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3'
Kpn I.

Bankia gouldi endoglucanase (37GP1) 

5' AATAAGGATCCGTTTAGCGACGCTCGC 3' (SEQ ID NO:45)

3' AATAAAAGCTTCCGGGTTGTACAGCGGTAATAGGC 5' (SEQ ID NO:46)

Vector: pQE52; and contains the following restriction enzyme sites 5' BamHI and 3'
Hind III.

Thermotoga maritima α-galactosidase (6GC2)

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGATCTGTGTGGAAATATTCGGAAAG 3'
(SEQ ID NO:47)

3' TCTATAAAGCnTCAlTCTCTCTCACCCTCTrCGTAGAAG 5' (SEQ ID NO:48)

Vector: pQET; and contains the following restriction enzyme sites 5' EcoRI and 3'
Hind III.



Thermotoga maritima β-mannanase (6GP2)

5' TTTATTCAATTGATTAAAGAGGAGAAATTAACTATGGGGATTGGTGGCGACGAC 3'
(SEQ ID NO:49)

3' •nTATTAAGCTTATCmTCATATTCACATACCTCC 5' (SEQ ID NO:50)

Vector: pQEt; and contains the following restriction enzyme sites 5' Hind III and 3' 
EcoRI. 



AEPII 1a β-mannanase (63GB1)

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGAGTTCCTATGGGGC 3'
(SEQ ID NO:51)

3' TTTATTAAGCTTCTCATCAACGGCTATGGTCTTCATTTC 5' (SEQ ID NO:52)

Vector^pQEl; and_c 

EcoRI. 



OC1/4V endoglucanase (33GP1)

5' AAAAAACAATTGAATTCATTAAAGAGGAGAAATTAACTATGCTAGAAAGACACTTCAGATATGTTCTT 3'
(SEQ ID NO:53)

3' TTTTTCGGATCCAATTCTTCATTTACTCTTTGCCTG 5' (SEQ ID NO:54)

Vector: pQEt; and contains the following restriction enzyme sites 5' BamHI and 3'
EcoRI.



Thermotoga maritima pullulanase (6GP3)

5' TTTTCTGGAATTCATTAAAGAGGAGAAATTAACTATGGAACTGATCATAGAAGGTTAC 3'
(SEQ ID NO:55)

3' ATAAGAAGCmrCACTCTCTGTACAGAACGTACGC 5' (SEQ ID NO:56)

Vector: pQEt; and contains the following restriction enzyme sites 5' EcoRI and 3' 
Hind III. 

The restriction enzyme sites indicated correspond to the restriction enzyme sites on the 
bacterial expression vector indicated for the respective gene (Qiagen, Inc. Chatsworth,
CA). The pQE vector encodes antibiotic resistance (Amp^r), a bacterial origin of
replication (ori), an IPTG-regulatable promoter operator (P/O), a ribosome binding site
(RBS), a 6-His tag and restriction enzyme sites.
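
The correspondence between the primers and the restriction enzyme sites named for
each vector can be checked by scanning a primer for the relevant recognition sequence.
The following is a minimal sketch, not part of the described protocol: the recognition
sequences are the standard ones for the enzymes named above, the helper name is
hypothetical, and the two primers are SEQ ID NO:31 and SEQ ID NO:32 as printed above
for OC1/4V-33B/G.

```python
# Standard recognition sequences for the enzymes named in the primer list.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}


def find_sites(sequence: str) -> dict[str, list[int]]:
    """Return {enzyme: [0-based start positions]} for every site present."""
    sequence = sequence.upper()
    hits: dict[str, list[int]] = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            start = sequence.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


# Primers as printed for OC1/4V-33B/G (vector pQE12, 5' EcoRI and 3' Bgl II).
SEQ_ID_NO_31 = "CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGAAGGTCCGATTTTCC"
SEQ_ID_NO_32 = "CGGAAGATCTTTAAGATTTTAGAAATTCCTT"

print(find_sites(SEQ_ID_NO_31))
print(find_sites(SEQ_ID_NO_32))
```

For these two primers the scan finds a single EcoRI site in SEQ ID NO:31 and a single
Bgl II site in SEQ ID NO:32, each starting after the four leading nucleotides, which
agrees with the sites stated for the pQE12 vector.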

The pQE vector was digested with the restriction enzymes indicated. The amplified
sequences were ligated into the respective pQE vector and inserted in frame with the
sequence encoding for the RBS. The ligation mixture was then used to transform the E.
coli strain M15/pREP4 (Qiagen, Inc.) by electroporation. M15/pREP4 contains multiple
copies of the plasmid pREP4, which expresses the lacI repressor and also confers
kanamycin resistance (Kan^r). Transformants were identified by their ability to grow on
LB plates and ampicillin/kanamycin resistant colonies were selected. Plasmid DNA was
isolated and confirmed by restriction analysis. Clones containing the desired constructs
were grown overnight (O/N) in liquid culture in LB media supplemented with both Amp
(100 µg/ml) and Kan (25 µg/ml). The O/N culture was used to inoculate a large culture
at a ratio of 1:100 to 1:250. The cells were grown to an optical density at 600 nm
(O.D.600) of between 0.4 and 0.6. IPTG ("Isopropyl-β-D-thiogalactopyranoside") was then
added to a final concentration of 1 mM. IPTG induces by inactivating the lacI repressor,
clearing the P/O and leading to increased gene expression. Cells were grown an extra 3 to
4 hours. Cells were then harvested by centrifugation.
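
The inoculation ratio and induction conditions above lend themselves to a short worked
calculation. The sketch below is illustrative only: the culture volume and the
concentration of the IPTG stock solution (taken here as 1 M) are assumptions that do
not appear in the text, the function names are hypothetical, and the small volume change
caused by adding the stock is neglected.

```python
def inoculum_volume_ml(culture_ml: float, dilution: float) -> float:
    """Volume of overnight culture for a 1:dilution inoculation."""
    return culture_ml / dilution


def iptg_stock_volume_ml(culture_ml: float,
                         final_mM: float = 1.0,
                         stock_mM: float = 1000.0) -> float:
    """Volume of IPTG stock giving the final concentration (C1*V1 = C2*V2).

    stock_mM defaults to an assumed 1 M stock; the text states only the
    1 mM final concentration.
    """
    return culture_ml * final_mM / stock_mM


def ready_to_induce(od600: float, low: float = 0.4, high: float = 0.6) -> bool:
    """True when the optical density lies in the stated 0.4 to 0.6 window."""
    return low <= od600 <= high


culture_ml = 1000.0  # hypothetical large-culture volume
print(inoculum_volume_ml(culture_ml, 100), inoculum_volume_ml(culture_ml, 250))
print(iptg_stock_volume_ml(culture_ml))
print(ready_to_induce(0.5))
```

For a hypothetical 1 litre culture this gives 4 to 10 ml of overnight culture as inoculum
and 1 ml of the assumed 1 M IPTG stock at induction.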

The primer sequences set out above may also be employed to isolate the target gene from 
the deposited material by hybridization techniques described above.

Example 2

Isolation of a Selected Clone From the Deposited Genomic Clones

A clone is isolated directly by screening the deposited material using the
oligonucleotide primers set forth in Example 1 for the particular gene desired to be
isolated. The specific oligonucleotides are synthesized using an Applied Biosystems
DNA synthesizer. The oligonucleotides are labeled with 32P-γ-ATP using T4
polynucleotide kinase and purified according to a standard protocol (Maniatis et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor,
NY, 1982). The deposited clones in the pBluescript vectors may be employed to
transform bacterial hosts which are then plated on 1.5% agar plates to the density of
20,000-50,000 pfu/150 mm plate. These plates are screened using Nylon membranes
according to the standard screening protocol (Stratagene, 1993). Specifically, the
Nylon membrane with denatured and fixed DNA is prehybridized in 6 x SSC, 20 mM
NaH2PO4, 0.4% SDS, 5 x Denhardt's, 500 µg/ml denatured, sonicated salmon sperm
DNA; and 6 x SSC, 0.1% SDS. After one hour of prehybridization, the membrane is
hybridized with hybridization buffer 6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 500 µg/ml
denatured, sonicated salmon sperm DNA, with 1 x 10* cpm/ml 32P-probe overnight at
42°C. The membrane is washed at 45-50°C with washing buffer 6 x SSC, 0.1% SDS
for 20-30 minutes, dried and exposed to Kodak X-ray film overnight. Positive clones
are isolated and purified by secondary and tertiary screening. The purified clone is
sequenced to verify its identity to the primer sequence.
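
The buffer compositions given above can be converted into pipetting volumes with the
usual C1*V1 = C2*V2 relation. The following is a minimal sketch under stated
assumptions: the stock concentrations (20 x SSC, 1 M NaH2PO4, 10% SDS) and the
100 ml final volume are assumptions chosen for illustration and are not stated in the
text, and the function name is hypothetical. Denhardt's solution and salmon sperm DNA
are omitted for brevity.

```python
def stock_volume_ml(final_conc: float, stock_conc: float, final_volume_ml: float) -> float:
    """Volume of stock needed so that final_conc is reached in final_volume_ml.

    final_conc and stock_conc must be in the same units (x, mM or %).
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_volume_ml * final_conc / stock_conc


FINAL_VOLUME_ML = 100.0  # assumed batch size

# component: (final concentration from the text, assumed stock concentration)
components = {
    "SSC (x)": (6.0, 20.0),
    "NaH2PO4 (mM)": (20.0, 1000.0),
    "SDS (%)": (0.4, 10.0),
}

volumes = {
    name: stock_volume_ml(final, stock, FINAL_VOLUME_ML)
    for name, (final, stock) in components.items()
}
water_ml = FINAL_VOLUME_ML - sum(volumes.values())

for name, ml in volumes.items():
    print(f"{name}: {ml:.1f} ml")
print(f"water to volume: {water_ml:.1f} ml")
```

Under these assumed stocks, 100 ml of buffer takes 30 ml of 20 x SSC, 2 ml of 1 M
NaH2PO4 and 4 ml of 10% SDS, with the remainder made up with water and the
components omitted from this sketch.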



Once the clone is isolated, the two oligonucleotide primers corresponding to the gene
of interest are used to amplify the gene from the deposited material. A polymerase
chain reaction is carried out in 25 µl of reaction mixture with 0.5 µg of the DNA of
the gene of interest. The reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin,
20 µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of
Taq polymerase. Thirty five cycles of PCR (denaturation at 94°C for 1 min;
annealing at 55°C for 1 min; elongation at 72°C for 1 min) are performed with the
Perkin-Elmer Cetus automated thermal cycler. The amplified product is analyzed by
agarose gel electrophoresis and the DNA band with expected molecular weight is
excised and purified. The PCR product is verified to be the gene of interest by
subcloning and sequencing the DNA product. The ends of the newly purified genes
are nucleotide sequenced to identify full length sequences. Complete sequencing of
full length genes is then performed by Exonuclease III digestion or primer walking.
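
The cycling conditions stated above can be written down as a small data structure, which
also makes the total programmed hold time easy to compute. This is a sketch for
illustration only: the class and function names are hypothetical, and ramp times between
temperatures, which depend on the instrument, are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# One PCR cycle as stated in the text: 1 min each at 94, 55 and 72 degrees C.
CYCLE = (
    Step("denaturation", 94.0, 60),
    Step("annealing", 55.0, 60),
    Step("elongation", 72.0, 60),
)
N_CYCLES = 35


def total_hold_minutes(cycle: tuple[Step, ...], n_cycles: int) -> float:
    """Total programmed hold time in minutes, excluding ramp times."""
    return n_cycles * sum(step.seconds for step in cycle) / 60


print(total_hold_minutes(CYCLE, N_CYCLES))
```

Thirty five cycles of three one-minute steps amount to 105 minutes of programmed hold
time, before any allowance for ramping.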

Example 3 
Screening for Galactosidase Activity

Screening procedures for α-galactosidase protein activity may be assayed for as
follows:

Substrate plates were provided by a standard plating procedure. Dilute
XL1-Blue MRF' E. coli host (Stratagene Cloning Systems, La Jolla, CA) to O.D.600
= 1.0 with NZY media. In 15 ml tubes, inoculate 200 µl diluted host cells with phage.
Mix gently and incubate tubes at 37 °C for 15 min. Add approximately 3.5 ml LB top
agarose (0.7%) containing 1 mM IPTG to each tube and pour onto an NZY plate
surface. Allow to cool and incubate at 37 °C overnight. The assay plates are
obtained as substrate p-nitrophenyl α-galactoside (Sigma) (200 mg/100 ml) (100
mM NaCl, 100 mM potassium-phosphate), 1% (w/v) agarose. The plaques are
overlaid with nitrocellulose and incubated at 4 °C for 30 minutes, whereupon the
nitrocellulose is removed and overlaid onto the substrate plates. The substrate
plates are then incubated at 70 °C for 20 minutes.

Example 4 

Screening of Clones for Mannanase Activity

A solid phase screening assay was utilized as a primary screening method to test
clones for β-mannanase activity.

A culture solution of the Y1090- E. coli host strain (Stratagene Cloning Systems, La
Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified library from
Thermotoga maritima lambda gt11 library was diluted in SM (phage dilution buffer):
5 x 10^ pfu/µl diluted 1:1000 then 1:100 to 5 x 10^ pfu/µl. Then 8 µl of phage
dilution (5 x 10^ pfu/µl) was plated in 200 µl host cells. They were then incubated in
15 ml tubes at 37 °C for 15 minutes.
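
The two-step dilution described above (1:1000 followed by 1:100) gives an overall
dilution of 1:100,000, and the number of plaque forming units plated follows from the
diluted titer and the 8 µl plated. The sketch below illustrates the arithmetic only; the
function names are hypothetical, and the stock titer used in the example is a
hypothetical value, not one taken from the text.

```python
from math import prod


def overall_dilution(steps: list[int]) -> int:
    """Overall dilution factor for a series of 1:n dilution steps."""
    return prod(steps)


def pfu_plated(stock_titer_pfu_per_ul: float, steps: list[int], volume_ul: float) -> float:
    """Plaque forming units delivered when volume_ul of the final dilution is plated."""
    diluted_titer = stock_titer_pfu_per_ul / overall_dilution(steps)
    return diluted_titer * volume_ul


STEPS = [1000, 100]      # 1:1000 then 1:100, as in the text
EXAMPLE_TITER = 1.0e7    # hypothetical stock titer in pfu/ul

print(overall_dilution(STEPS))
print(pfu_plated(EXAMPLE_TITER, STEPS, volume_ul=8))
```

For the hypothetical stock titer shown, the final dilution contains 100 pfu/µl, so plating
8 µl delivers 800 pfu to the 200 µl of host cells.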



Approximately 4 ml of molten LB top agarose (0.7%) at approximately 52 °C was
added to each tube and the mixture was poured onto the surface of LB agar plates.
The agar plates were then incubated at 37 °C for five hours. The plates were
replicated and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes
(Stratagene Cloning Systems, La Jolla, CA) overnight. The nylon membranes and
plates were marked with a needle to keep their orientation and the nylon membranes
were then removed and stored at 4 °C.



An Azo-galactomannan overlay was applied to the LB plates containing the lambda
plaques. The overlay contains 1% agarose, 50 mM potassium-phosphate buffer pH 7,
0.4% Azocarob-galactomannan (Megazyme, Australia). The plates were incubated
at 72 °C. The Azocarob-galactomannan treated plates were observed after 4 hours
then returned to incubation overnight. Putative positives were identified by clearing
zones on the Azocarob-galactomannan plates. Two positive clones were observed.



The nylon membranes referred to above, which correspond to the positive clones,
were retrieved, oriented over the plate and the portions matching the locations of the
clearing zones for positive clones were cut out. Phage was eluted from the membrane
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution
buffer) and 25 µl CHCl3.



Example 5 

Screening of Clones for Mannosidase Activity 

A solid phase screening assay was utilized as a primary screening method to test
clones for β-mannosidase activity.

A culture solution of the Y1090- E. coli host strain (Stratagene Cloning Systems, La
Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified library from
AEPII 1a lambda gt11 library was diluted in SM (phage dilution buffer): 5 x 10^
pfu/µl diluted 1:1000 then 1:100 to 5 x 10^ pfu/µl. Then 8 µl of phage dilution
(5 x 10^ pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml
tubes at 37 °C for 15 minutes.

Approximately 4 ml of molten LB top agarose (0.7%) at approximately 52 °C was
added to each tube and the mixture was poured onto the surface of LB agar plates.
The agar plates were then incubated at 37 °C for five hours. The plates were
replicated and induced with 10 mM IPTG-soaked Duralon-UV™ nylon membranes
(Stratagene Cloning Systems, La Jolla, CA) overnight. The nylon membranes and
plates were marked with a needle to keep their orientation and the nylon membranes
were then removed and stored at 4 °C.

A p-nitrophenyl-β-D-mannopyranoside overlay was applied to the LB plates
containing the lambda plaques. The overlay contains 1% agarose, 50 mM potassium-
phosphate buffer pH 7, and 0.4% p-nitrophenyl-β-D-mannopyranoside (Megazyme,
Australia). The plates were incubated at 72 °C. The p-nitrophenyl-β-D-mannopyranoside
treated plates were observed after 4 hours, then returned to incubation
overnight. Putative positives were identified by clearing zones on the
p-nitrophenyl-β-D-mannopyranoside plates. Two positive clones were observed.

The nylon membranes referred to above, which correspond to the positive clones,
were retrieved and oriented over the plate, and the portions matching the locations of the
clearing zones for positive clones were cut out. Phage was eluted from the membrane
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution
buffer) and 25 µl CHCl3.

Example 6

Screening for Pullulanase Activity

Pullulanase activity may be screened for as follows:
Substrate plates were provided by a standard plating procedure. Host cells
are diluted to O.D.600 = 1.0 with NZY or appropriate media. In 15 ml tubes, inoculate
200 µl diluted host cells with phage. Mix gently and incubate tubes at 37 °C for 15
min. Approximately 3.5 ml LB top agarose (0.7%) is added to each tube and the
mixture is plated, allowed to cool, and incubated at 37 °C for about 28 hours.
Overlays of 4.5 ml of the following substrate are poured:

100 ml total volume

0.5 g    Red Pullulan (Megazyme, Australia)
1.0 g    Agarose
5 ml     Buffer (Tris-HCl pH 7.2 @ 75 °C)
2 ml     5 M NaCl
5 ml     CaCl2 (100 mM)
85 ml    dH2O

Plates are cooled at room temperature, and then incubated at 75 °C for 2 hours.
Positives are observed as showing substrate degradation.
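
The overlay recipe is stated per 100 ml, while each plate receives 4.5 ml, so the batch has to be scaled to the number of plates. The following is a minimal sketch of that scaling; the function name, the 10% excess allowance, and the reading of the water volume as 85 ml are assumptions made for illustration, not part of the described procedure.

```python
# Recipe per 100 ml of overlay, as listed in Example 6.
RECIPE_PER_100_ML = {
    "Red Pullulan (g)": 0.5,
    "Agarose (g)": 1.0,
    "Tris-HCl buffer, pH 7.2 (ml)": 5.0,
    "5 M NaCl (ml)": 2.0,
    "100 mM CaCl2 (ml)": 5.0,
    "dH2O (ml)": 85.0,  # ASSUMPTION: read as 85 ml from a partly garbled line
}
OVERLAY_ML_PER_PLATE = 4.5


def overlay_batch(n_plates, excess=0.10):
    """Scale the 100 ml recipe to n_plates overlays plus a fractional excess.

    `excess` is a hypothetical pipetting allowance (10% by default).
    Returns the total batch volume in ml and the scaled component amounts.
    """
    total_ml = n_plates * OVERLAY_ML_PER_PLATE * (1 + excess)
    factor = total_ml / 100.0
    amounts = {name: round(qty * factor, 3) for name, qty in RECIPE_PER_100_ML.items()}
    return total_ml, amounts


total, amounts = overlay_batch(20, excess=0.0)
print(f"batch volume: {total} ml")
for name, qty in amounts.items():
    print(f"  {name}: {qty}")
```

Keeping the recipe as a per-100 ml table and applying a single scale factor preserves the stated proportions regardless of how many plates are poured.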

Example 7

Screening for Endoglucanase Activity

Endoglucanase activity may be screened for as follows:

1. The gene library is plated onto 6 LB/GelRite/0.1% CMC/NZY agar plates
(~4,800 plaque forming units/plate) in E. coli host with LB agarose as top agarose.
The plates are incubated at 37 °C overnight.

2. Plates are chilled at 4 °C for one hour.

3. The plates are overlaid with Duralon membranes (Stratagene) at
room temperature for one hour and the membranes are oriented and lifted off the
plates and stored at 4 °C.

4. The top agarose layer is removed and plates are incubated at 37 °C
for ~3 hours.

5. The plate surface is rinsed with NaCl.

6. The plate is stained with 0.1% Congo Red for 15 minutes.

7. The plate is destained with 1 M NaCl.

8. The putative positives identified on plate are isolated from the
Duralon membrane (positives are identified by clearing zones around clones). The
phage is eluted from the membrane by incubating in 500 µl SM + 25 µl CHCl3.

9. Insert DNA is subcloned into any appropriate cloning vector and
subclones are reassayed for CMCase activity using the following protocol:

i) Spin 1 ml overnight miniprep of clone at maximum speed
for 3 minutes.

ii) Decant the supernatant and use it to fill "wells" that have
been made in an LB/GelRite/0.1% CMC plate.

iii) Incubate at 37 °C for 2 hours.

iv) Stain with 0.1% Congo Red for 15 minutes.

v) Destain with 1 M NaCl for 15 minutes.

vi) Identify positives by clearing zone around clone.

Numerous modifications and variations of the present invention are possible in light
of the above teachings and, therefore, within the scope of the appended claims, the
invention may be practiced otherwise than as particularly described.

WHAT IS CLAIMED IS:

1. An isolated polynucleotide selected from the group consisting of:

(a) SEQ ID NOS: 1-14 and 57-60;

(b) SEQ ID NOS: 1-14 and 57-60, wherein T can also be U;

(c) polynucleotide sequences complementary to SEQ ID NOS: 1-14
and 57-60;

(d) polynucleotide sequences which encode an amino acid sequence as
set forth in SEQ ID NOS: 15-28, and 61-64; and

(e) fragments of (a), (b), (c) or (d) that are at least 15 consecutive
bases in length and that will selectively hybridize to DNA which
encodes a polypeptide of SEQ ID NOS: 15-28, and 61-64.

2. A vector comprising a polynucleotide of claim 1.

3. A host cell containing the vector of claim 2. 



4. The method of claim 3, wherein the host cell is a eukaryotic cell. 

5. The method of claim 3, wherein the host cell is a prokaryotic cell. 

6. A method for producing a polypeptide comprising: 

(a) culturing the host cells of claim 3; 

(b) expressing from the host cell of claim 3 a polypeptide encoded by 
said polynucleotide; and 

(c) isolating the polypeptide.

7. An enzyme selected from the group consisting of:

(a) an enzyme comprising an amino acid sequence set forth in SEQ ID
NOS: 15-28 or 61-64; and

(b) an enzyme which comprises at least 30 consecutive amino acid
residues of an enzyme of (a).

8. An enzyme of which at least a portion is coded for by a polynucleotide of
claim 1, and which is selected from the group consisting of:

(a) an enzyme comprising an amino acid sequence which is at least
70% identical to an amino acid sequence selected from the group
of amino acid sequences set forth in SEQ ID NOS: 15-28 or 61-64;
and

(b) an enzyme which comprises at least 30 amino acid residues to the
enzyme of (a).

9. A method for generating glucose from soluble cell oligosaccharides
comprising contacting a sample containing oligosaccharides with an
effective amount of an enzyme selected from the group consisting of an
enzyme having the amino acid sequence set forth in SEQ ID NOS: 15-28,
61-63 and 64 such that glucose is produced.

10. The method of claim 9, wherein the sample is selected from the group
consisting of dairy products, fruit juices, detergents, textiles, guar gum,
animal feed, plant biomass and waste products.

11. The method of claim 9, wherein the oligosaccharide is selected from the
group consisting of maltose, cellobiose, lactose, sucrose, raffinose,
stachyose, verbascose, cellulose, starch, amylose, glycogen, disaccharides,
polysaccharides and pullulan.
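
Claim 1 refers to sequences in which T can also be U, to complementary sequences, and to fragments of at least 15 consecutive bases. These three sequence operations can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the toy sequence is not one of the SEQ ID NOS, the complement is written as the reverse complement by convention, and selective hybridization is not modeled.

```python
# Map each DNA base to its Watson-Crick partner (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def to_rna(dna):
    """Return the sequence with every T replaced by U."""
    return dna.replace("T", "U").replace("t", "u")


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def fragments(seq, length=15):
    """Yield every window of `length` consecutive bases in seq."""
    for start in range(len(seq) - length + 1):
        yield seq[start:start + length]


# ASSUMPTION: toy 21-base sequence for illustration, not a claimed sequence.
toy = "ATGAAAAAAAATCTACTAATG"
print(to_rna(toy))
print(reverse_complement(toy))
print(sum(1 for _ in fragments(toy)), "fragments of 15 bases")
```

A sequence shorter than 15 bases yields no fragments, which matches the minimum length stated in the claim.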
M11TL GLYCOSIDASE - 29G

COMPLETE GENE SEQUENCE - 9/95

[Sequence rows for nucleotides 1-180 (residues 1-60) illegible in the source scan.]



lei l-TA AAC CAA AAT CAC CAC CAC CTt: CCT UAC AAC CTC CCC CTT ACT ATT A(;A CTA .:cr 240 

61 t.,.M. Asn Gin A^n Asp His Asp Urn AJa c:lu Lys Leu Cly Val Asn .Thr He Aro Vol cly eo 

T[ ^t*" T'^ ^ ^""^ ACT TTC AAT CTT AAA CTC CCT CTA CAC ACA 300 

81 val Clu Trp Ser Ar« He Phe Pro Lys l>ro Thr Phe Asn Val Lys Val Pro Val Ciu Arg 100 

301 CAT CAC AAC CGC ACC ATT CTT CAC CTA CAT CTC CAT CAT AAA GCC CTT CAA ACA CTT CAT 3fiO 

101 ASP Clu Asn Cly Ser He Val His Val Asp Val Asp Asp Lys Ala Val Clu Arg Leu Asp 120 

361 CAA TTA CCC AAC AAC CAC GCC CTA AAC CAT TAC CTA CAA ATC TAT AAA CAC TCC CTT CAA 420 

121 Clu Leu Ala Asn Lys Clu Ala Val Asn His Tyr Val Clu Met Tyr Lys Asp Trp Val Glu ' 140 

421 .ACA CCT ACA AAA CTT ATA CTC AAT TTA TAC CAT TCC CCC CTC CCT CTC TCG CTT CAC AAC 480 

14 1 Arg Cly Arg Lys Leu He Leu Asn Leu Tyr His Trp Pro Leu Pro Leu Trp Leu His Asn 160 

4 81 CCA ATC ATC CTC ACA ACA ATC CCC CCC t;AC ACA CCC CCC TCA CCC TCG CTT AAC GAG GAG 54 0 

161 Pro lie Mec Val Arg Arg Met Gly Pro Asp Arg Ala Pro Ser Cly Trp Leu Asn Glu Glu J 30 

541 TCC CTC CTC GAG TTT GCC AAA TAC GCC CCA TAC ATT GCT TGG AAA ATC GCC GAG CTA CCT 600 

181 Ser Val Val Glu Phe Ala Lys Tyr Ala Ala Tyr lie Ala Trp Lys Hec Cly Glu Leu Pro 200 

601 CTT ATC TCG ACC ACC ATC AAC CAA CCC AAC CTC CTT TAT GAG CAA CCA TAC ATG TTC GTT 660 

201 Val Met Trp Ser Thr Met Asn Glu Pro Asn Val Val Tyr Glu Gin Cly Tyr Met Phe Val 220 

661 AAA CCC CCT TTC CCA CCC CCC TAC TTG ACT TTC CAA GCT GCT CAT AAC CCC ACG AGA AAT 720 



221 Lys Cly Cly Phe Pro Pro Cly Tyr Leu Ser Leu Glu Ala Ala 



Asp Lys Ala Arg Arg Asn 240 



721 ATC ATC CAC GCT CAT CCA CCC GCC TAT CAC AAT ATT AAA CCC TTC ACT AAC AAA CCT CTT 780 
241 Met He Gin Ala His Ala Arg Ala Tyr Asp Asn He Lys Arg Phe Ser Lys Lys Pro Val 260 



781 CCA CTA ATA TAC GCT TTC CAA TCG TTC CAA CTA TTA GAG GCT CCA CCA CAA CTA TTT GAT 84 0 

2 61 Gly Leu He Tyr Ala Phe Cln Trp Phe Clu Leu Leu Clu Cly Pro Ala Clu Val Phe Asp 2 80 

841 AAC TTT AAC ACC TCT AAC TTA TAC TAT TTC ACA CAC ATA CTA TCC AAC OCT ACT TCA ATC 500 

281 Lys Phe Lys Ser Ser Lys Leu Tyr Tyr Phe Thr Asp He Val Ser Lys Cly Ser Ser He 300 

901 ATC AAT CTT CAA TAC ACG AGA CAT CTT GCC AAT AGG CTA CAC TCG TTC CGC CTT AAC TAC 960 

301 He Asn Val Glu Tyr Arg Arg Asp Leu Ala Asn Arg Leu Asp Trp Leu Cly Val Asn Tyr 3 20 

961 TAT ACC CCT TTA CTC TAC AAA ATC CTC CAT CAC AAA CCT ATA ATC CTC CAC CCC TAT CCA 1020 

321 Tyr Ser Arg Leu Val Tyr Lys He Val Asp Asp Lys Pro He He Leu His Gly Tyr Gly 34 0 

1021 TTC CTT TCT ACA CCT CGC CCC ATC ACC CCC CCT CAA AAT CCT TCT ACC CAT TTT CCC TCG 1080 

341 Phe Leu Cys Thr Pro Cly Cly He Ser Pro Ala Clu Asn Pro Cys Ser Asp Phe Gly Trp 36 O 

1081 CAC CTC TAT CCT CAA CCA CTC TAC CTA CTT CTA AAA CAA CTT TAC AAC CCA TAC CCC CTA 1140 

361 Clu Va) Tyr Fro Clu Cly Leu Tyr Leu Leu Leu Lys Clu Leu Tyr Asn Arg Tyr Cly Val 380 

1141 CAC TTC ATC CTC Al C CAC AAC CCT CTT TCA CAC ACC A(.*t; CAT CCC TTC AGA CCC CCA TAC 1200 

381 Asp Leu He Val Thr i;Ju Asn Cly VaJ Ser Asp Ser Arq Asp Ala Leu Arg Pro Ala Tyr 40O 

[Sequence rows from nucleotide 1201 onward illegible in the source scan.]

Figure 1a.

OC1/4 GLYCOSIDASE - 33G/B
COMPLETE GENE SEQUENCE - 9/95

[Sequence rows illegible in the source scan.]

Figure 8a.
Figure 9a.
Figure 9b (continued)
Figure 10a.
Figure 10b (continued)
Figure 11a.
Figure 11b (continued)
[Sequence rows illegible in the source scan.]

Figure
OC1/4V Endoglucanase (33GP1)

[Sequence rows illegible in the source scan.]

Figure 13a

[Sequence rows illegible in the source scan.]
[Sequence rows illegible in the source scan.]

Figure 14a

Figure 14b (continued)

Figure 14c (continued)

Figure 14d (continued)

Figure 14e (continued)
[Sequence rows illegible in the source scan.]

CTC GTT GAG TGG GTG AAG GAG ATG AGC TCC TAC ATA AAG AGT CTG GAT CCC
Leu Val Glu Trp Val l,ys Glu Met Ser Ser Tyr He i,ys Ser Leu Asp Pro 

AAC CAC CTC GTO GCT GTG GGG CAC GAA GGA TTC TTC AGC AAC TAC GAA GGA 
Asn Hxs Leu Val Ala Val Gly Asp Glu Gly Phe Phe Ser Asn Tyr Glu Gly 

TTC AAA CCT TAC GGT GGA GAA GCC GAG TGG GCC TAC AAC GGC TGG TCC GGT 
Phe Lys Pro Tyr Gly Gly Glu Ala Glu Trp Ala Tyr Asn Gly Trp Ser Gly 

GTT GAC TGG AAG AAG CTC CTT TCG ATA GAG ACG GTG GAC TTC GGC ACG TTC 
val Asp Trp Lys Lys Leu Leu Ser He Glu Thr Val Asp Phe Gly Thr Phe 

CAC CTC TAT CCG TCC CAC TGG GGT GTC AGT CCA GAG AAC TAT GCC CAG TGG 
H.S Leu Tyr Pro Ser His Trp Gly Val Ser Pro Glu Asn Tyr Ala Gin Trp 

GGA GCG AAG TGG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC GGA AAA 
Gly Ala Lys Trp He Glu Asp His He Lys He Ala Lys Glu He Gly Lys 

CCC GTT GTT CTG GAA GAA TAT GGA ATT CCA AAG AGT GCG CCA GTT AAC AGA 
Pro val val Leu Glu Glu Tyr Gly He Pro Lys Ser Ala Pro Val Asn Arg 

-ACG-GCC-ATC-TAC-AGA-CTC-TGG-ASC-GAT-CTG-GTC TAC GAT CTC GGT GGA GAT 
Thr Ala He Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly Asp 

GGA GCG ATG TTC TGG ATG CTC GCG GGA ATC GGG GAA GGT TCG GAC AGA GAC 
Gly Ala Met Phe Trp Met Leu Ala Gly He Gly Glu Gly Ser Asp Arg Asp 

GAG AGA GGG TAC TAT CCG GAC TAC GAC GGT TTC AGA ATA GTG AAC GAC GAC 
Glu Arg Gly Tyr Tyr Pro Asp Tyr Asp Gly Phe Arg He Val Asn Asp Asp 

AGT CCA GAA GCG GAA CTG ATA AGA GAA TAC GCG AAG CTG TTC AAC ACA GGT 

Ser Pro Glu Ala Glu Leu lie Ara nin , t 

xxe Arg Giu Tyr Ala Lys Leu Phe Asn Thr Gly 

GAA GAC ATA AGA GAA GAC ACC TGC TCT TTC ATC CTT CCA AAA GAC GGC ATG 
Glu ASP He Arg Glu Asp Thr Cys Ser Phe He Leu Pro Lys Asp Gly Met 

GAG ATC AAA AAG ACC GTG GAA GTG AGG GCT GGT GTT TTC GAC TAC AGC AAC 

Figure 15b (continued)

[Sequence rows illegible in the source scan.]

Figure 15c (continued)

GGG GTG AAA GAA CTT CAC ATA GGA GTT GTC GGT GAT CAT CTG AGG TAG GAT
Gly Val Lys Glu Leu His Ile Gly Val Val Gly Asp His Leu Arg Tyr Asp

GGA CCG ATT TTC ATC GAT AAT GTG AGA CTT TAT AAA AGA ACA GGA GGT ATG
Gly Pro Ile Phe Ile Asp Asn Val Arg Leu Tyr Lys Arg Thr Gly Gly Met

TGA 1991
END

Figure 15d (continued)
[Sequence rows illegible in the source scan.]

Tyr P.o ASP H.S Leu Pro Trp Phe Arg Lys Leu Ala Asn Clu Glu Ala Arg Lys He 400 

Zl f° ^ CCC TCC Ain- GTT CTC TGG TGC 03.. A.C ...C CAA AAC AAC 126 

-1 val Arg Lys Leu Arg Tyr His Pro Ser lie Val Leu Trp Cys Gly Asn Asn G^ 1 



0 

Asn 420 



0 
440 



'421 Z T r ^ '^'^ "'^^ CTC GGA AAC 132 

Trp Gly PHe Asp Glu Trp Gly Asn Met Ala Arg Lys Val Asp Gly Xle Asn Leu Gly Asn H 

'-'x Z Zl Z r r ^ "ao 

Arg Leu Tyr Leu Phe Asp Phe Pro Glu lie Cvs Ala ri» ri. 

x±e i,ys Ala Glu Glu Asp Pro Ser Thr Pro Tyr 



460 



'''x T^ To r r '^'^^ AGC GAA AAG GAA GGA GAC AGO CAC X..0 

Trp Pro ser Ser Pro Tyr Gly Gly Glu Lys Ala Asn Ser Glu Lys Glu Gly Asp Arg His 1 

Trp Tyr Val Trp Ser Gly Trp Met Asn Tyr Glu Asn Tyr Glu Lys Asp Thr Gly Arg 500 

'box ne ser Gl! T T T ^'^^ ^^'^ TTC TTT TCA X560 

ser Glu Phe Gly phe Gin Gly Ala Pro His Pro Glu Thr He Glu Phe Phe Ser 520 

1561 AAA CCC GAG GAA AGA GAG ATA TTC CAT CCC GTr ^-vn r~r.r~ 

52X Lys Pro Glu Gl„ » , AAG CAC AAC AAA CAG GTG GAA 1620 

Glu Glu Arg Glu He Phe His Pro Val Met Leu Lys His Asn Lys Gin Val Glu 540 

Figure 16b (continued)

[Sequence rows illegible in the source scan.]

2041 TGT GAG TTT GGT TGA 2055
681 Cys Glu Phe Gly End 685

Figure 16c (continued)
Figure No. 17a - Bankia gouldi (37GP4)

1 ATG AAA AAA AAT CTA CTA ATG TTT AAA AGG CTT ACG TAT CTA CCT TTG TTT TTA ATG CTG 60 

1 Met Lys hys Asn Leu Leu Met Phe Lys Arg Leu Thr Tyr Leu Pro Leu Phe Leu Met Leu 20 

61 CTC TCA CTA AGT TCA GTA GCT CAA TCP CCT GTA GAA AAA CAT GGC CGT TTA CAA GTT GAG 120 

21 Leu Ser Leu Ser Ser Val Ala Gin Ser Pro Val Glu Lys His Gly Ajrg Leu Gin Val Asp 40 

121 GGA AAC CGC ATT CTT AAT GCG TCT GGA GAA ATT ACG AGC TTA GCT GGT AAC AGC CTC TTT 180 

41 Gly Asn Arg He Leu Asn Ala Ser Gly Glu He Thr Ser Leu Ala Gly Asn Ser Leu Phe 60 

181 TGG AGT AAT GCT GGA GAC ACC TCC GAT TTT TAT AAT GCA GAA ACT GTT GAT TTT TTA GCA 240 

61 Trp Ser Asn Ala Gly Asp Thr Ser Asp Phe Tyr Asn Ala Glu Thr Val Asp Phe Leu Ala SO 

241 GAA AAC TGG AAT AGC TCA CTT ATT AGA ATA GCT ATG GGC GTA AAA GAA AAT TGG GAT GGC 300 

81 Glu Asn Trp Asn Ser Ser Leu He Arg He Ala Met Gly Val Lys Glu Asn Trp Asp Gly 100 

3 01 GGA A;^.T GGC TAT ATT GAT AGT CCG CAG GAG C,K^ GAA GCT AAA ATT AGA AAA GTT ATT GAT 360 

101 Gly Asn Gly Tyr He Asp Ser Pro Gin Glu Gin Glu Ala Lys He Arg Lys Val He Asp 120 

361 GCA GCT ATT GCT AAC GGC ATA TAT GTA ATA ATA GAC TGG CAC ACT CAC GAA GCA GAG TTA 420 

121 Ala Ala He Ala Asn Gly He Tyr Val He He Asp Trp His Thr His Glu Ala Glu Leu 140 

All IAC„ACA_GAT_GAG_GCT_GTT^GAC^TTT--TTT~ACG-AGA-ATG~ CTA-TAC-GGA-GAT-ACT-CCC ^48 0" 

141 Tyr Thr Asp Glu Ala Val Asp Phe Phe Thr Arg Met Ala Asp Leu Tyr Gly Asp Thr Pro 160 

481 AAT GTA ATG TAT GAA ATT TAT AAC GAG CCT ATA TAC CAA AGT TGG CCT GTT ATT AAG AAT 540 

161 Asn Val Met Tyr Glu He Tyr Asn Glu Pro He Tyr Gin Ser Trp Pro Val He Lys Asn 180 

541 TAT GCA GAG CAA GTA ATT GCT GGT ATA CGT TCT AAA GAC CCA GAT AAT TTA ATA ATT GTA 600 

181 Tyr Ala Glu Gin Val He Ala Gly He Arg Ser Lys Asp Pro Asp Asn Leu He He Val 200 

601 GGT ACT AGC AAT TAT TCT CAG CAA GTT GAT GTA GCA TCA GCA GAC CCA ATA TCT GAT ACT 660 

201 Gly Thr Ser Asn Tyr Ser Gin Gin Val Asp Val Ala Ser Ala Asp Pro He Ser Asp Thr 220 

661 AAT GTG GCA TAT ACT TTA CAT TTT TAT GCA GCA TTT AAC CCG CAT GAT AAC TTA AGA AAT 720 

221 Asn Val Ala Tyr Thr Leu His Phe Tyr Ala Ala Phe Asn Pro His Asp Asn Leu Arg Asn 240 

721 GTA GCA CAG ACA GCA TTA GAT AAT AAT GTT GCT TTG TTT GTT ACA GAA TGG GGT ACA ATT 780 

241 Val Ala Gin Thr Ala Leu Asp Asn Asn Val Ala Leu Phe Val Thr Glu Trp Gly Thr He 260 
[Sequence rows illegible in the source scan.]

Figure 17b (continued)

[Sequence rows illegible in the source scan.]

5.1 Ala Phe lie A=p .ys Gly Ala Tyr OXy Phe Val Tyr Arg Asn Thr Phe Asn Val Asp 58 

1741 «^ TCT OAA GTA ATA AAT ACT GGA GTA GAC TTT TTA GAT AGA GOT ACA GGA TTT ACA 18 

S81 Gly ser Glu Val lie Asn Thr Gly Val Asp Phe Leu Asp Arg Gly Thr Gly Phe Asn Th^ 

1801 TTT AGA AAT GCA ATA TTT GAA AAT ACA TAT AAC CTT GGC AGT AGA GCT TCA GAA ATT 1B60 

SOI Gly Phe ^ Asn Ala Xle Phe Glu Asn Thr Tyr Asn .eu Gly Ser Are Ala Ser Z Z 

ser Thr Ala Arg Ly, Lys Gin Gly Ser Pro Glu Gin Thr His Val Trp Asp Asn lie Arg 640 

1921 AAC OCT AAT TCT GTT GAT TTT CCA ATA ACT GAT GGT ACA GAA AAT CTA GTA AAT AAA TTC ISBO 
641 Asn Pro Asn Ser Val Asp Phe Pro He ser Asp Gly Thr Glu Asn Le-. Val Asn Lys Phe 



660 



1981 TGC CCA GAT TGG AAT ATA GAA CCA TGT AAT CCT GTA GAC GAA ACC AAC CAA GCA CCT ACA 2040 

661 Cys Pro Asp Trp Asn He Glu Pro Cys Asn Pro Val Asp Glu Thr Asn Gin Ala Pro Thr 680 

2041 ATA AGC TTC CTA TCT CCT GTT AJ.C AAT ATT ACT TTA GTT GAA GGT TAT AAT TTA CAA GTT 2100 

__,Sai__Ile-Ser-Phe-Beu Ser-Pro-Val-Asn-Asn-tle^hr Leu Val Glu Gly Tyr i^n l^u Gin Val ^ 

2101 GAA GTT AAT GCT ACT GAT GCA GAT GGA ACT ATT GAT AAT GTA AAA CTT TAT ATA GAT AAC 2160 

701 Glu val Asn Ala Thr Asp Ala Asp Gly Thr He Asp Asn Val Lys Leu Tyr He Asp Asn 720 

2161 AAT TTA GTT AGG CAA ATA AAT TCT ACT TCA TAT AAA TGG GGC CAT TCT GAT TCT CCA AAT 2220 

721 Aen Leu Val Arg Gin He Asn Ser Thr Ser Tyr Lys Trp Gly His Ser Asp Ser Pro Asn 740 

2221 ^ GAA CTT AAT GGT CTT ACA GAA GGA ACT TAT ACC TTA AAA GCA ATT GCA ACT GAT 2280 

Thr ASP Glu Leu Asn Gly Leu Thr Glu Gly Thr Tyr Thr Leu Lys Ala He Ala Thr Asp 760 

2281 AAC GAC GGG GCT TCT ACA GAA ACG CAA TTT ACG TTA ACT GTA ATA ACA GAA CAA AGT CCG 2340 

Asn ASP Gly Ala Ser Thr Glu Thr Gin Phe Thr Leu Thr Val He Thr Glu Gin Ser Pro 780 

ser Glu Asn Cys Asp Phe Asn Thr Pro Ser Ser Thr Gly Leu Glu Asp Phe Asp He Lys 800 

2401 TTT TCT AAC GTT TTT GAG TTA GGA TCT GGC GGA CCA TCT TTA AGT AAT TTA AAA ACA 2460 

Figure 17c (continued)

[Sequence rows illegible in the source scan.]

Figure 17d (continued)

WO 98/24799 PCT/US97/22623



Figure No. 18a - Pyrococcus furiosus VC1 (7EG1)

leader sequence: amino acids 1-24

5'
9 18 27 36 45 54
ATG AGC AAG AAA AAG TTC GTC ATC GTA TCT ATC TTA ACA ATC CTT TTA GTA CAG
Met Ser Lys Lys Lys Phe Val Ile Val Ser Ile Leu Thr Ile Leu Leu Val Gln

63 72 81 90 99 108
GCA ATA TAT TTT GTA GAA AAG TAT CAT ACC TCT GAG GAG AAG TCA ACT TCA AAT
Ala Ile Tyr Phe Val Glu Lys Tyr His Thr Ser Glu Asp Lys Ser Thr Ser Asn

117 126 135 144 153 162
ACC TCA TCT ACA CCA CCC CAA ACA ACA CTT TCC ACT ACC AAG GTT CTC AAG ATT
Thr Ser Ser Thr Pro Pro Gln Thr Thr Leu Ser Thr Thr Lys Val Leu Lys Ile

171 180 189 198 207 216
AGA TAC CCr GAT GAC GGT GAG TGG CCA GGA GCT CCT ATT GAT AAG GAT GGT GAT
Arg Tyr Pro Asp Asp Gly Glu Trp Pro Gly Ala Pro Ile Asp Lys Asp Gly Asp

225 234 243 252 261 270
GGG AAC CCA GAA TTC TAC ATT GAA ATA AAC CTA TGG AAC ATT CTT AAT GCT ACT
Gly Asn Pro Glu Phe Tyr Ile Glu Ile Asn Leu Trp Asn Ile Leu Asn Ala Thr

279 288 297 306 315 324
GGA TTT GCT GAG ATG ACG TAC AAT TTA ACC AGC GGC GTC CTT CAC TAC GTC CAA
Gly Phe Ala Glu Met Thr Tyr Asn Leu Thr Ser Gly Val Leu His Tyr Val Gln

333 342 351 360 369 378
CAA CTT GAC AAC ATT GTC TTG AGG GAT AGA AGT AAT TGG GTG CAT GGA TAC CCC
Gln Leu Asp Asn Ile Val Leu Arg Asp Arg Ser Asn Trp Val His Gly Tyr Pro

387 396 405 414 423 432
GAA ATA TTC TAT GGA AAC AAG CCA TGG AAT GCA AAC TAC GCA ACT GAT GGC CCA
Glu Ile Phe Tyr Gly Asn Lys Pro Trp Asn Ala Asn Tyr Ala Thr Asp Gly Pro

441 450 459 468 477 486
ATA CCA TTA CCC AGT AAA GTT TCA AAC CTA ACA GAC TTC TAT CTA ACA ATC TCC
Ile Pro Leu Pro Ser Lys Val Ser Asn Leu Thr Asp Phe Tyr Leu Thr Ile Ser
[Sequence rows illegible in the source scan.]

Figure 18b (continued)